Description

This invention relates to the use of bacteriophage T7-type DNA polymerases in a method of amplification 
of a DNA sequence. 

DNA sequencing involves the generation of four populations of single stranded DNA fragments having one
defined terminus and one variable terminus. The variable terminus always terminates at a specific given
nucleotide base (either guanine (G), adenine (A), thymine (T), or cytosine (C)). The four different sets of
fragments are each separated on the basis of their length, on a high resolution polyacrylamide gel; each band
on the gel corresponds colinearly to a specific nucleotide in the DNA sequence, thus identifying the positions
in the sequence of the given nucleotide base.

Generally there are two methods of DNA sequencing. One method (Maxam and Gilbert sequencing) involves
the chemical degradation of isolated DNA fragments, each labelled with a single radiolabel at its defined
terminus, each reaction yielding a limited cleavage specifically at one or more of the four bases (G, A, T or
C). The other method (dideoxy sequencing) involves the enzymatic synthesis of a DNA strand. Four separate
syntheses are run, each reaction being caused to terminate at a specific base (G, A, T or C) via incorporation
of the appropriate chain terminating dideoxynucleotide. The latter method is preferred since the DNA fragments
are uniformly labelled (instead of end labelled) and thus the larger DNA fragments contain increasingly more
radioactivity. Further, 35S-labelled nucleotides can be used in place of 32P-labelled nucleotides, resulting in
sharper definition; and the reaction products are simple to interpret since each lane corresponds only to either
G, A, T or C. The enzyme used for most dideoxy sequencing is the Escherichia coli DNA polymerase I large
fragment ("Klenow"). Another polymerase used is AMV reverse transcriptase.

Summary of the Invention 

The invention features a method of amplification of a DNA sequence comprising annealing a first and
second primer to opposite strands of a double stranded DNA sequence and incubating the annealed mixture with
a processive bacteriophage T7-type DNA polymerase (also referred to hereinafter as "T7-type DNA
polymerase") having less than 500 units of exonuclease activity per mg of polymerase, preferably less than
1 unit, wherein the first and second primers anneal to opposite strands of the DNA sequence; in preferred
embodiments the primers have their 3' ends directed toward each other; and the method further comprises, after
the incubation step, denaturing the resulting DNA, annealing the first and second primers to the resulting DNA
and incubating the annealed mixture with the polymerase; preferably the cycle of denaturing, annealing and
incubating is repeated from 10 to 40 times.
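As an illustration of why the cycle is repeated, the following minimal sketch estimates the fold amplification over 10 to 40 cycles. It assumes the usual idealization that each cycle multiplies the target by (1 + efficiency), with efficiency 1.0 meaning perfect doubling; that model and the function name are assumptions made for illustration and are not stated in this specification.

```python
def ideal_fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase in target copies after `cycles` rounds of
    denaturing, annealing and incubating.

    Assumption (not from the specification): each cycle multiplies the
    number of target copies by (1 + efficiency), 0 <= efficiency <= 1.
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return (1.0 + efficiency) ** cycles


# The preferred range in the text is 10 to 40 cycles.
low = ideal_fold_amplification(10)
high = ideal_fold_amplification(40)
print(f"10 cycles: {low:.0f}-fold; 40 cycles: {high:.3e}-fold")
```

Real reactions fall short of perfect doubling, so these figures are upper bounds under the stated assumption.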

This invention provides a bacteriophage T7-type DNA polymerase which is processive, non-discriminating,
and can utilize short primers. Further, the polymerase has less than 50% of the exonuclease activity of the
naturally associated level of exonuclease activity of said polymerase. These are ideal properties for the above
described method.

Other features and advantages of the invention will be apparent from the following description of the pre- 
ferred embodiments thereof and from the claims. 


Description of the Preferred Embodiments 
The drawings will first briefly be described. 
Drawings

Figs. 1-3 are diagrammatic representations of the vectors pTrx-2, mGP1-1, and pGP5-5 respectively;
Fig. 4 is a graphical representation of the selective oxidation of T7 DNA polymerase;
Fig. 5 is a graphical representation of the ability of modified T7 polymerase to synthesize DNA in the
presence of etheno-dATP; and
Fig. 6 is a diagrammatic representation of the enzymatic amplification of genomic DNA using modified T7
DNA polymerase.
Figs. 7, 8 and 9 are the nucleotide sequences of pTrx-2, a part of pGP5-5 and mGP1-2 respectively.
Fig. 10 is a diagrammatic representation of pGP5-6.

DNA Polymerase 



The bacteriophage T7-type DNA polymerase of this invention, which is substantially the same as the one
in cells infected with a T7-type phage (i.e., phage in which the DNA polymerase requires host thioredoxin as
a subunit; for example, the T7-type phage is T7, T3, ΦI, ΦII, H, W31, gh-1, Y, A1122, or SP6, Studier, 95 Virology
70, 1979), is processive, has less than 50% of the exonuclease activity of the naturally associated level of
exonuclease activity of said polymerase, does not discriminate against nucleotide analog incorporation, and
can utilize small oligonucleotides (such as tetramers, hexamers and octamers) as specific primers. These
properties will now be discussed in detail.

Processivity 

By processivity is meant that the DNA polymerase is able to continuously incorporate many nucleotides
using the same primer-template without dissociating from the template, under conditions normally used for DNA
sequencing extension reactions. The degree of processivity varies with different polymerases: some
incorporate only a few bases before dissociating (e.g., Klenow (about 15 bases), T4 DNA polymerase (about 10
bases), T5 DNA polymerase (about 180 bases) and reverse transcriptase (about 200 bases) (Das et al., J. Biol.
Chem. 254:1227, 1979; Bambara et al., J. Biol. Chem. 253:413, 1978)), while others, such as those of the present
invention, will remain bound for at least 500 bases and preferably at least 1,000 bases under suitable
environmental conditions. Such environmental conditions include having adequate supplies of all four
deoxynucleoside triphosphates and an incubation temperature from 10°C-50°C. Processivity is greatly enhanced
in the presence of E. coli single stranded binding (ssb) protein.

With processive enzymes termination of a sequencing reaction will occur only at those bases which have
incorporated a chain terminating agent, such as a dideoxynucleotide. If the DNA polymerase is non-processive,
then artifactual bands will arise during sequencing reactions, at positions corresponding to the nucleotide
where the polymerase dissociated. Frequent dissociation creates a background of bands at incorrect positions
and obscures the true DNA sequence. This problem is partially corrected by incubating the reaction mixture
for a long time (30-60 min) with a high concentration of substrates, which "chase" the artifactual bands up to
a high molecular weight at the top of the gel, away from the region where the DNA sequence is read. This is
not an ideal solution since a non-processive DNA polymerase has a high probability of dissociating from the
template at regions of compact secondary structure, or hairpins. Reinitiation of primer elongation at these sites
is inefficient and the usual result is the formation of bands at the same position for all four nucleotides, thus
obscuring the DNA sequence.

Analog discrimination

The DNA polymerases of this invention do not discriminate significantly between dideoxy-nucleotide
analogs and normal nucleotides. That is, the chance of incorporation of an analog is approximately the same
as that of a normal nucleotide, or the polymerase at least incorporates the analog with at least 1/10 the
efficiency of a normal nucleotide. The polymerases of this invention also do not discriminate significantly
against some other analogs. This is important since, in addition to the four normal deoxynucleoside
triphosphates (dGTP, dATP, dTTP and dCTP), sequencing reactions require the incorporation of other types of
nucleotide derivatives such as radioactively- or fluorescently-labelled nucleoside triphosphates, usually for
labeling the synthesized strands with 35S, 32P, or other chemical agents. When a DNA polymerase does not
discriminate against analogs the same probability will exist for the incorporation of an analog as for a normal
nucleotide. For labelled nucleoside triphosphates this is important in order to efficiently label the synthesized
DNA strands using a minimum of radioactivity. Further, lower levels of analogs are required with such enzymes,
making the sequencing reaction cheaper than with a discriminating enzyme.

Discriminating polymerases show a different extent of discrimination when they are polymerizing in a
processive mode versus when stalled, struggling to synthesize through a secondary structure impediment. At such
impediments there will be a variability in the intensity of different radioactive bands on the gel, which may
obscure the sequence.


Exonuclease Activity

The DNA polymerase of the invention has less than 50%, preferably less than 1%, and most preferably
less than 0.1%, of the normal or naturally associated level of exonuclease activity (amount of activity per
polymerase molecule). By normal or naturally associated level is meant the exonuclease activity of unmodified
T7-type polymerase. Normally the associated activity is about 5,000 units of exonuclease activity per mg of
polymerase, measured as described below by a modification of the procedure of Chase et al. (249 J. Biol.
Chem. 4545, 1974). Exonucleases increase the fidelity of DNA synthesis by excising any newly synthesized
bases which are incorrectly base paired to the template. Such associated exonuclease activities are detrimental
to the quality of DNA sequencing reactions. They raise the minimal required concentration of nucleotide
precursors which must be added to the reaction since, when the nucleotide concentration falls, the polymerase
activity slows to a rate comparable with the exonuclease activity, resulting in no net DNA synthesis, or even
degradation of the synthesized DNA.
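The percentage thresholds and the units-per-mg figures quoted in this specification can be related by simple arithmetic. The sketch below takes the native level as the approximate 5,000 units/mg given above; the function name is hypothetical and the native value is only approximate.

```python
NATIVE_EXO_UNITS_PER_MG = 5000.0  # approximate native level quoted in the text


def residual_exonuclease_percent(units_per_mg: float,
                                 native: float = NATIVE_EXO_UNITS_PER_MG) -> float:
    """Residual exonuclease activity as a percentage of the native level."""
    return 100.0 * units_per_mg / native


# 500 units/mg (the upper limit in the amplification method) is 10% of native;
# 1 unit/mg (the preferred limit) is 0.02% of native.
for level in (2500.0, 500.0, 1.0):
    print(level, "units/mg ->", residual_exonuclease_percent(level), "% of native")
```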

More importantly, associated exonuclease activity will cause a DNA polymerase to idle at regions in the
template with secondary structure impediments. When a polymerase approaches such a structure its rate of
synthesis decreases as it struggles to pass. An associated exonuclease will excise the newly synthesized DNA
when the polymerase stalls. As a consequence numerous cycles of synthesis and excision will occur. This may
result in the polymerase eventually synthesizing past the hairpin (with no detriment to the quality of the
sequencing reaction); or the polymerase may dissociate from the synthesized strand (resulting in an artifactual
band at the same position in all four sequencing reactions); or, a chain terminating agent may be incorporated
at a high frequency and produce a wide variability in the intensity of different fragments in a sequencing gel.
This happens because the frequency of incorporation of a chain terminating agent at any given site increases
with the number of opportunities the polymerase has to incorporate the chain terminating nucleotide, and so the
DNA polymerase will incorporate a chain-terminating agent at a much higher frequency at sites of idling than
at other sites.
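The dependence of termination frequency on the number of incorporation opportunities can be sketched numerically. The model below assumes each opportunity at a site independently incorporates a terminator with the same probability; that independence assumption, the function name, and the example probability of 0.01 are illustrative only and are not taken from this specification.

```python
def termination_probability(p_per_attempt: float, attempts: int) -> float:
    """Probability that a chain is terminated at a site after `attempts`
    incorporation opportunities, assuming each opportunity independently
    incorporates a chain terminator with probability `p_per_attempt`."""
    if not 0.0 <= p_per_attempt <= 1.0 or attempts < 0:
        raise ValueError("need 0 <= p_per_attempt <= 1 and attempts >= 0")
    return 1.0 - (1.0 - p_per_attempt) ** attempts


# One opportunity per site (no idling) versus 20 synthesis/excision cycles
# at a site where the polymerase idles.
single_pass = termination_probability(0.01, 1)
idling_site = termination_probability(0.01, 20)
print(f"single pass: {single_pass:.4f}, idling site: {idling_site:.4f}")
```

Under these assumptions the idling site terminates many times more often than a normal site, which is the band-intensity variability the text describes.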

An ideal sequencing reaction will produce bands of uniform intensity throughout the gel. This is essential
for obtaining the optimal exposure of the X-ray film for every radioactive fragment. If there is variable intensity
of radioactive bands, then fainter bands have a chance of going undetected. To obtain uniform radioactive
intensity of all fragments, the DNA polymerase should spend the same interval of time at each position on the
DNA, showing no preference for either the addition or removal of nucleotides at any given site. This occurs if
the DNA polymerase lacks any associated exonuclease, so that it will have only one opportunity to incorporate
a chain terminating nucleotide at each position along the template.


Short primers 

The DNA polymerase of the invention is able to utilize primers of 10 bases or less, as well as longer ones,
most preferably of 4-20 bases. The ability to utilize short primers offers a number of important advantages to
DNA sequencing. The shorter primers are cheaper to buy and easier to synthesize than the usual 15-20-mer
primers. They also anneal faster to complementary sites on a DNA template, thus making the sequencing
reaction faster. Further, the ability to utilize small (e.g., six or seven base) oligonucleotide primers for DNA
sequencing permits strategies not otherwise possible for sequencing long DNA fragments. For example, a kit
containing 80 random hexamers could be generated, none of which are complementary to any sites in the cloning
vector. Statistically, one of the 80 hexamer sequences will occur an average of every 50 bases along the DNA
fragment to be sequenced. The determination of a sequence of 3000 bases would require only five sequencing
cycles. First, a "universal" primer (e.g., New England Biolabs #1211, sequence 5' GTAAAACGACGGCCAGT 3') would
be used to sequence about 600 bases at one end of the insert. Using the results from this sequencing reaction,
a new primer would be picked from the kit homologous to a region near the end of the determined sequence.
In the second cycle, the sequence of the next 600 bases would be determined using this primer. Repetition of
this process five times would determine the complete sequence of the 3000 bases, without necessitating any
subcloning, and without the chemical synthesis of any new oligonucleotide primers. The use of such short
primers may be enhanced by including the gene 2.5 and gene 4 proteins of T7 in the sequencing reaction.
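The figures in the hexamer-kit example can be checked with a short calculation. The sketch below assumes a random template with equal frequencies of the four bases, so that any given hexamer occurs on average once per 4^6 = 4096 positions; the function names are hypothetical.

```python
import math


def mean_spacing(kit_size: int, primer_length: int) -> float:
    """Average distance, in bases, between template sites matched by any
    primer of the kit, assuming a random template with equal base
    frequencies (an idealization)."""
    return 4 ** primer_length / kit_size


def cycles_needed(target_length: int, read_length: int) -> int:
    """Number of successive primer-walking cycles needed to cover the target."""
    return math.ceil(target_length / read_length)


spacing = mean_spacing(kit_size=80, primer_length=6)        # 4096 / 80
cycles = cycles_needed(target_length=3000, read_length=600)
print(f"one kit hexamer about every {spacing:.1f} bases; {cycles} cycles")
```

The computed spacing of about 51 bases agrees with the "every 50 bases" estimate, and 3000 bases at about 600 bases per reaction gives the five cycles stated.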

DNA polymerases of this invention (i.e., having the above properties) include modified bacteriophage
T7-type polymerases. That is, the DNA polymerases require host thioredoxin as a subunit, and they are
substantially identical to a modified bacteriophage T7 DNA polymerase or to equivalent enzymes isolated from
related phage, such as T3, ΦI, ΦII, H, W31, gh-1, Y, A1122 and SP6. Each of these enzymes can be modified to have
properties similar to those of the modified bacteriophage T7 enzyme. It is possible to isolate the enzyme from
phage infected cells directly, but preferably the enzyme is isolated from cells which overproduce it. By
substantially identical is meant that the enzyme may have amino acid substitutions which do not affect the
overall properties of the enzyme. One example of a particularly desirable amino acid substitution is one in which
the natural enzyme is modified to remove any exonuclease activity. This modification may be performed at the
genetic or chemical level (see below).

Cloning T7 polymerase

As an example of the invention we shall describe the cloning, overproduction, purification, modification and
use of T7 DNA polymerase. This processive enzyme consists of two polypeptides tightly complexed in a one
to one stoichiometry. One is the phage T7-encoded gene 5 protein of 84,000 daltons (Modrich et al., 250 J. Biol.
Chem. 5515, 1975), the other is the E. coli encoded thioredoxin, of 12,000 daltons (Tabor et al., J. Biol. Chem.
262:16,216, 1987). The thioredoxin is an accessory protein and attaches the gene 5 protein (the non-processive
actual DNA polymerase) to the primer template. The natural DNA polymerase has a very active 3' to 5'
exonuclease associated with it. This activity makes the polymerase useless for DNA sequencing and must be
inactivated or modified before the polymerase can be used. This is readily performed, as described below, either
chemically, by local oxidation of the exonuclease domain, or genetically, by modifying the coding region of the
polymerase gene encoding this activity.

pTrx-2

In order to clone the trxA (thioredoxin) gene of E. coli, wild type E. coli DNA was partially cleaved with Sau3A
and the fragments ligated to BamHI-cleaved T7 DNA isolated from strain T7 ST9 (Tabor et al., in Thioredoxin
and Glutaredoxin Systems: Structure and Function (Holmgren et al., eds) pp. 285-300, Raven Press, NY; and
Tabor et al., supra). The ligated DNA was transfected into E. coli trxA⁻ cells, the mixture plated onto trxA⁻ cells,
and the resulting T7 plaques picked. Since T7 cannot grow without an active E. coli trxA gene only those phages
containing the trxA gene could form plaques. The cloned trxA genes were located on a 470 base pair HincII
fragment.

In order to overproduce thioredoxin a plasmid, pTrx-2, was constructed. Briefly, the 470 base pair
HincII fragment containing the trxA gene was isolated by standard procedure (Maniatis et al., Cloning: A
Laboratory Manual, Cold Spring Harbor Labs., Cold Spring Harbor, N.Y.), and ligated to a derivative of pBR322
containing a Ptac promoter (ptac-12, Amann et al., 25 Gene 167, 1983). Referring to Fig. 2, ptac-12, containing
β-lactamase and ColE1 origin, was cut with PvuII, to yield a fragment of 2290 bp, which was then ligated to two
tandem copies of trxA (HincII fragment) using commercially available linkers (SmaI-BamHI polylinker), to form
pTrx-2. The complete nucleotide sequence of pTrx-2 is shown in Figure 7. Thioredoxin production is now under
the control of the tac promoter, and thus can be specifically induced, e.g. by IPTG (isopropyl
β-D-thiogalactoside).

pGP5-5 and mGP1-2 

Some gene products of T7 are lethal when expressed in E. coli. An expression system was developed to
facilitate cloning and expression of lethal genes, based on the inducible expression of T7 RNA polymerase.
Gene 5 protein is lethal in some E. coli strains and an example of such a system is described by Tabor et al.,
82 Proc. Nat. Acad. Sci. 1074 (1985), where T7 gene 5 was placed under the control of the Φ10 promoter, and
is only expressed when T7 RNA polymerase is present in the cell.

Briefly, pGP5-5 (Fig. 3) was constructed by standard procedures using synthetic BamHI linkers to join the T7
fragment from 14306 (NdeI) to 16869 (AhaIII), containing gene 5, to the 560 bp fragment of T7 from 5667 (HincII)
to 6166 (Fnu4H1) containing both the Φ1.1A and Φ1.1B promoters, which are recognized by T7 RNA polymerase,
and the 3 kb BamHI-HincII fragment of pACYC177 (Chang et al., 134 J. Bacteriol. 1141, 1978). The nucleotide
sequence of the T7 inserts and linkers is shown in Fig. 8. In this plasmid gene 5 is only expressed when T7 RNA
polymerase is provided in the cell.

Referring to Fig. 3, T7 RNA polymerase is provided on phage vector mGP1-2. This is similar to pGP1-2
(Tabor et al., id.) except that the fragment of T7 from 3133 (HaeIII) to 5840 (HinfI), containing T7 RNA polymerase,
was ligated, using linkers (BglII and SalI respectively), to BamHI-SalI cut M13 mp8, placing the polymerase gene
under control of the lac promoter. The complete nucleotide sequence of mGP1-2 is shown in Fig. 9.

Since pGP5-5 and pTrx-2 have different origins of replication (respectively a P15A and a ColE1 origin) they
can be transformed into one cell simultaneously. pTrx-2 expresses large quantities of thioredoxin in the presence
of IPTG. mGP1-2 can coexist in the same cell as these two plasmids and be used to regulate expression of
T7 DNA polymerase from pGP5-5, simply by causing production of T7 RNA polymerase by inducing the lac
promoter with, e.g., IPTG.


Overproduction of T7 DNA polymerase

There are several potential strategies for overproducing and reconstituting the two gene products of trxA
and gene 5. The same cell strains and plasmids can be utilized for all the strategies. In the preferred strategy
the two genes are co-overexpressed in the same cell. (This is because gene 5 protein is susceptible to proteases
until thioredoxin is bound to it.) As described in detail below, one procedure is to place the two genes separately
on each of two compatible plasmids in the same cell. Alternatively, the two genes could be placed in tandem on
the same plasmid. It is important that the T7 gene 5 is placed under the control of a non-leaky inducible
promoter, such as Φ1.1A, Φ1.1B and Φ10 of T7, as the synthesis of even small quantities of the two polypeptides
together is toxic in most E. coli cells. By non-leaky is meant that less than 500 molecules of the gene product
are produced, per cell generation time, from the gene when the promoter controlling the gene's expression is
not activated. Preferably the T7 RNA polymerase expression system is used, although other expression systems
which utilize inducible promoters could also be used. A leaky promoter, e.g., plac, allows more than 500
molecules of protein to be synthesized, even when not induced; thus cells containing lethal genes under the
control of such a promoter grow poorly and are not suitable in this invention. It is of course possible to produce
these products in cells where they are not lethal; for example, the plac promoter is suitable in such cells.

In a second strategy each gene can be cloned and overexpressed separately. Using this strategy, the cells
containing the individually overproduced polypeptides are combined prior to preparing the extracts, at which
point the two polypeptides form an active T7 DNA polymerase.

Example 1: Production of T7 DNA polymerase 

E. coli strain 71.18 (Messing et al., Proc. Nat. Acad. Sci. 74:3642, 1977) is used for preparing stocks of
mGP1-2. 71.18 is stored in 50% glycerol at -80°C and is streaked on a standard minimal media agar plate. A single
colony is grown overnight in 25 ml standard M9 media at 37°C, and a single plaque of mGP1-2 is obtained by
titering the stock using freshly prepared 71.18 cells. The plaque is used to inoculate 10 ml 2X LB (2%
Bacto-Tryptone, 1% yeast extract, 0.5% NaCl, 6 mM NaOH) containing JM103 grown to an A590=0.5. This culture will
provide the phage stock for preparing a large culture of mGP1-2. After 3-12 hours, the 10 ml culture is
centrifuged, and the supernatant used to infect the large (2 L) culture. For the large culture, 4 X 500 ml 2X LB is
inoculated with 4 X 5 ml 71.18 cells grown in M9, and is shaken at 37°C. When the large culture of cells has
grown to an A590=1.0 (approximately three hours), they are inoculated with 10 ml of supernatant containing the
starter lysate of mGP1-2. The infected cells are then grown overnight at 37°C. The next day, the cells are
removed by centrifugation, and the supernatant is ready to use for induction of K38/pGP5-5/pTrx-2 (see below).
The supernatant can be stored at 4°C for approximately six months, at a titer ~5X10° Φ/ml. At this titer, 1 L
of phage will infect 12 liters of cells at an A590=5 with a multiplicity of infection of 15. If the titer is low, the
mGP1-2 phage can be concentrated from the supernatant by dissolving NaCl (60 gm/liter) and PEG-6000 (65
gm/liter) in the supernatant, allowing the mixture to settle at 0°C for 1-72 hours, and then centrifuging (7000 rpm
for 20 min). The precipitate, which contains the mGP1-2 phage, is resuspended in approximately 1/20th of the
original volume of M9 media.

K38/pGP5-5/pTrx-2 is the E. coli strain (genotype HfrC (λ)) containing the two compatible plasmids pGP5-5
and pTrx-2. pGP5-5 plasmid has a P15A origin of replication and expresses the kanamycin (Km) resistance
gene. pTrx-2 has a ColE1 origin of replication and expresses the ampicillin (Ap) resistance gene. The plasmids
are introduced into K38 by standard procedures, selecting KmR and ApR respectively. The cells
K38/pGP5-5/pTrx-2 are stored in 50% glycerol at -80°C. Prior to use they are streaked on a plate containing 50 µg/ml
ampicillin and kanamycin, grown at 37°C overnight, and a single colony grown in 10 ml LB media containing
50 µg/ml ampicillin and kanamycin, at 37°C for 4-6 hours. The 10 ml cell culture is used to inoculate 500 ml of
LB media containing 50 µg/ml ampicillin and kanamycin and shaken at 37°C overnight. The following day, the
500 ml culture is used to inoculate 12 liters of 2X LB-KPO4 media (2% Bacto-Tryptone, 1% yeast extract, 0.5%
NaCl, 20 mM KPO4, 0.2% dextrose, and 0.2% casamino acids, pH 7.4), and grown with aeration in a fermentor
at 37°C. When the cells reach an A590=5.0 (i.e. logarithmic or stationary phase cells), they are infected with
mGP1-2 at a multiplicity of infection of 10, and IPTG is added (final concentration 0.5 mM). The IPTG induces
production of thioredoxin and the T7 RNA polymerase in mGP1-2, and thence induces production of the cloned
DNA polymerase. The cells are grown for an additional 2.5 hours with stirring and aeration, and then harvested.
The cell pellet is resuspended in 1.5 L 10% sucrose/20 mM Tris-HCl, pH 8.0/25 mM EDTA and re-spun. Finally,
the cell pellet is resuspended in 200 ml 10% sucrose/20 mM Tris-HCl, pH 8/1.0 mM EDTA, and frozen in liquid
N2. From 12 liters of induced cells 70 gm of cell paste are obtained containing approximately 700 mg gene 5
protein and 100 mg thioredoxin.

K38/pTrx-2 (K38 containing pTrx-2 alone) overproduces thioredoxin, and it is added as a "booster" to
extracts of K38/pGP5-5/pTrx-2 to ensure that thioredoxin is in excess over gene 5 protein at the outset of the
purification. The K38/pTrx-2 cells are stored in 50% glycerol at -80°C. Prior to use they are streaked on a plate
containing 50 µg/ml ampicillin, grown at 37°C for 24 hours, and a single colony grown at 37°C overnight in 25
ml LB media containing 50 µg/ml ampicillin. The 25 ml culture is used to inoculate 2 L of 2X LB media and shaken
at 37°C. When the cells reach an Ae^.O, the ptac promoter, and thus thioredoxin production, is induced by
the addition of IPTG (final concentration 0.5 mM). The cells are grown with shaking for an additional 12-16 hours
at 37°C, harvested, resuspended in 600 ml 10% sucrose/20 mM Tris-HCl, pH 8.0/25 mM EDTA, and re-spun.
Finally, the cells are resuspended in 40 ml 10% sucrose/20 mM Tris-HCl, pH 8/0.5 mM EDTA, and frozen in
liquid N2. From 2 L of cells 16 gm of cell paste are obtained containing 150 mg of thioredoxin.

Assays for the polymerase involve the use of single-stranded calf thymus DNA (6 mM) as a substrate. This
is prepared immediately prior to use by denaturation of double-stranded calf thymus DNA with 50 mM NaOH
at 20°C for 15 min, followed by neutralization with HCl. Any purified DNA can be used as a template for the
polymerase assay, although preferably it will have a length greater than 1,000 bases.

The standard T7 DNA polymerase assay used is a modification of the procedure described by Grippo et
al. (246 J. Biol. Chem. 6867, 1971). The standard reaction mix (200 µl final volume) contains 40 mM Tris/HCl
pH 7.5, 10 mM MgCl2, 5 mM dithiothreitol, 100 nmol alkali-denatured calf thymus DNA, 0.3 mM dGTP, dATP,
dCTP and [3H]dTTP (20 cpm/pmol), 50 µg/ml BSA, and varying amounts of T7 DNA polymerase. Incubation is
at 37°C (10°C-45°C) for 30 min (5 min-60 min). The reaction is stopped by the addition of 3 ml of cold (0°C)
1 N HCl-0.1 M pyrophosphate. Acid-insoluble radioactivity is determined by the procedure of Hinkle et al. (250 J.
Biol. Chem. 5523, 1974). The DNA is precipitated on ice for 15 min (5 min-12 hr), then collected on glass-fiber
filters by filtration. The filters are washed five times with 4 ml of cold (0°C) 0.1 M HCl-0.1 M pyrophosphate, and
twice with cold (0°C) 90% ethanol. After drying, the radioactivity on the filters is counted using a non-aqueous
scintillation fluor.

One unit of polymerase activity catalyzes the incorporation of 10 nmol of total nucleotide into an
acid-insoluble form in 30 min at 37°C, under the conditions given above. Native T7 DNA polymerase and modified
T7 DNA polymerase (see below) have the same specific polymerase activity ± 20%, which ranges between
5,000-20,000 units/mg for native and 5,000-50,000 units/mg for modified polymerase, depending upon the
preparation, using the standard assay conditions stated above.
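The unit definition can be turned into a small conversion from filter counts to units. The sketch below makes two assumptions that are not stated in this specification: that the labelled dTMP represents one quarter of the total nucleotide incorporated (equal base composition), and that incorporation scales linearly with incubation time. The function name and the example count are illustrative.

```python
def polymerase_units(acid_insoluble_cpm: float,
                     cpm_per_pmol: float = 20.0,
                     labelled_fraction: float = 0.25,
                     minutes: float = 30.0) -> float:
    """Convert acid-insoluble counts to polymerase units.

    One unit = 10 nmol of total nucleotide incorporated in 30 min at 37 C.
    Assumptions (illustrative): the label is `labelled_fraction` of all
    incorporated nucleotide, and incorporation is linear in time.
    """
    pmol_labelled = acid_insoluble_cpm / cpm_per_pmol
    nmol_total = pmol_labelled / labelled_fraction / 1000.0
    return (nmol_total / 10.0) * (30.0 / minutes)


# 50,000 cpm at 20 cpm/pmol is 2,500 pmol dTMP, i.e. 10 nmol total nucleotide.
units = polymerase_units(50_000)
print(f"{units:.2f} unit(s)")
```

Dividing the result by the mass of enzyme in the assay gives the specific activity in units/mg.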

T7 DNA polymerase is purified from the above extracts by precipitation and chromatography techniques. 
An example of such a purification follows. 

Frozen cell suspensions (200 ml K38/pGP5-5/pTrx-2 and 40 ml K38/pTrx-2) are thawed at 0°C overnight.
The cells are combined, and 5 ml of lysozyme (15 mg/ml) and 10 ml of NaCl (5 M) are added. After 45 min at
0°C, the cells are placed in a 37°C water bath until their temperature reaches 20°C. The cells are then frozen
in liquid N2. An additional 50 ml of NaCl (5 M) is added, and the cells are thawed in a 37°C water bath. After
thawing, the cells are gently mixed at 0°C for 60 min. The lysate is centrifuged for one hr at 35,000 rpm in a
Beckman 45Ti rotor. The supernatant (250 ml) is fraction I. It contains approximately 700 mg gene 5 protein
and 250 mg of thioredoxin (a 2:1 ratio thioredoxin to gene 5 protein).
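The ratio quoted for fraction I can be checked on a molar basis using the molecular weights given earlier (84,000 daltons for gene 5 protein, 12,000 daltons for thioredoxin). This is a minimal sketch; the function name is hypothetical and the masses are the approximate values stated in the text.

```python
def molar_ratio(mass_a_mg: float, kda_a: float,
                mass_b_mg: float, kda_b: float) -> float:
    """Molar ratio of species A to species B from masses (mg) and
    molecular weights (kilodaltons). mg / kDa gives micromoles."""
    return (mass_a_mg / kda_a) / (mass_b_mg / kda_b)


# Fraction I: about 250 mg thioredoxin (12 kDa), 700 mg gene 5 protein (84 kDa).
ratio = molar_ratio(250.0, 12.0, 700.0, 84.0)
print(f"thioredoxin : gene 5 protein = {ratio:.1f} : 1 (molar)")
```

The computed value of about 2.5:1 is of the same order as the roughly 2:1 molar excess of thioredoxin stated for fraction I.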

90 gm of ammonium sulphate is dissolved in fraction I (250 ml) and stirred for 60 min. The suspension is
allowed to sit for 60 min, and the resulting precipitate collected by centrifugation at 8000 rpm for 60 min. The 
precipitate is redissolved in 300 ml of 20 mM Tris-HCI pH 7.5/5 mM 2-mercaptoethanol/0.1 mM EDTA/10% 
glycerol (Buffer A). This is fraction II. 

A column of Whatman DE52 DEAE (12.6 cm² x 18 cm) is prepared and washed with Buffer A. Fraction II
is dialyzed overnight against two changes of 1 L of Buffer A each until Fraction II has a conductivity
equal to that of Buffer A containing 100 mM NaCl. Dialyzed Fraction II is applied to the column at a
flow rate of 100 ml/hr, and washed with 400 ml of Buffer A containing 100 mM NaCl. Proteins are eluted with
a 3.5 L gradient from 100 to 400 mM NaCl in Buffer A at a flow rate of 60 ml/hr. Fractions containing T7 DNA
polymerase, which elutes at 200 mM NaCl, are pooled. This is fraction III (190 ml).

A column of Whatman P11 phosphocellulose (12.6 cm² x 12 cm) is prepared and washed with 20 mM KPO4
pH 7.4/5 mM 2-mercaptoethanol/0.1 mM EDTA/10% glycerol (Buffer B). Fraction III is diluted 2-fold (380 ml)
with Buffer B, then applied to the column at a flow rate of 60 ml/hr, and washed with 200 ml of Buffer B containing
100 mM KCl. Proteins are eluted with a 1.8 L gradient from 100 to 400 mM KCl in Buffer B at a flow rate of 60
ml/hr. Fractions containing T7 DNA polymerase, which elutes at 300 mM KCl, are pooled. This is fraction IV
(370 ml).

A column of DEAE-Sephadex® A-50 (4.9 cm² x 15 cm) is prepared and washed with 20 mM Tris-HCl pH 7.0/0.1
mM dithiothreitol/0.1 mM EDTA/10% glycerol (Buffer C). Fraction IV is dialyzed against two changes of 1 L
Buffer C to a final conductivity equal to that of Buffer C containing 100 mM NaCl. Dialyzed fraction IV is applied
to the column at a flow rate of 40 ml/hr, and washed with 150 ml of Buffer C containing 100 mM NaCl. Proteins
are eluted with a 1 L gradient from 100 to 300 mM NaCl in Buffer C at a flow rate of 40 ml/hr. Fractions containing
T7 DNA polymerase, which elutes at 210 mM NaCl, are pooled. This is fraction V (120 ml).

A column of BioRad HTP hydroxylapatite (4.9 cm² x 15 cm) is prepared and washed with 20 mM KPO4,
pH 7.4/10 mM 2-mercaptoethanol/2 mM Na citrate/10% glycerol (Buffer D). Fraction V is dialyzed against two
changes of 500 ml Buffer D each. Dialyzed fraction V is applied to the column at a flow rate of 30 ml/hr, and
washed with 100 ml of Buffer D. Proteins are eluted with a 900 ml gradient from 0 to 180 mM KPO4, pH 7.4 in
Buffer D at a flow rate of 30 ml/hr. Fractions containing T7 DNA polymerase, which elutes at 50 mM KPO4, are
pooled. This is fraction VI (130 ml). It contains 270 mg of homogeneous T7 DNA polymerase.
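To relate the stated elution concentrations to a position in each gradient, the sketch below assumes an ideal linear gradient with no column dead volume or mixing delay. Those assumptions, the function name, and the table layout are illustrative only; actual elution volumes will differ.

```python
def gradient_volume_at(conc_mM: float, start_mM: float,
                       end_mM: float, gradient_volume_l: float) -> float:
    """Volume of gradient (litres) delivered when an ideal linear gradient
    reaches `conc_mM`, ignoring dead volume and mixing delay."""
    if not min(start_mM, end_mM) <= conc_mM <= max(start_mM, end_mM):
        raise ValueError("concentration lies outside the gradient range")
    return (conc_mM - start_mM) / (end_mM - start_mM) * gradient_volume_l


# (column, start mM, end mM, gradient volume in L, elution concentration mM),
# all values as stated in the purification described above.
columns = [
    ("DE52 DEAE",        100.0, 400.0, 3.5, 200.0),
    ("phosphocellulose", 100.0, 400.0, 1.8, 300.0),
    ("DEAE-Sephadex",    100.0, 300.0, 1.0, 210.0),
    ("hydroxylapatite",    0.0, 180.0, 0.9,  50.0),
]
for name, start, end, volume, elution in columns:
    position = gradient_volume_at(elution, start, end, volume)
    print(f"{name}: polymerase expected after about {position:.2f} L of gradient")
```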

Fraction VI is dialyzed versus 20 mM KPO4 pH 7.4/0.1 mM dithiothreitol/0.1 mM EDTA/50% glycerol. This
is concentrated fraction VI (~65 ml, 4 mg/ml), and is stored at -20°C.

The isolated T7 polymerase has exonuclease activity associated with it. As stated above, this must be
inactivated. An example of inactivation by chemical modification follows.

Concentrated fraction VI is dialyzed overnight against 20 mM KPO4 pH 7.4/0.1 mM dithiothreitol/10%
glycerol to remove the EDTA present in the storage buffer. After dialysis, the concentration is adjusted to 2
mg/ml with 20 mM KPO4 pH 7.4/0.1 mM dithiothreitol/10% glycerol, and 30 ml (2 mg/ml) aliquots are placed in
50 ml polypropylene tubes. (At 2 mg/ml, the molar concentration of T7 DNA polymerase is 22 µM.)

Dithiothreitol (DTT) and ferrous ammonium sulfate (Fe(NH4)2(SO4)2·6H2O) are prepared fresh immediately
before use, and added to a 30 ml aliquot of T7 DNA polymerase, to concentrations of 5 mM DTT (0.6 ml of a
250 mM stock) and 20 µM Fe(NH4)2(SO4)2·6H2O (0.6 ml of a 1 mM stock). During modification the molar
concentrations of T7 DNA polymerase and iron are each approximately 20 µM, while DTT is in 250X molar excess.
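The stated concentrations can be checked with a short dilution calculation. The sketch below is illustrative only: it assumes the added volumes are simply additive, and the function and variable names are hypothetical rather than taken from the source.

```python
def final_conc(stock_conc, stock_vol_ml, total_vol_ml):
    """Concentration after diluting stock_vol_ml of a stock into total_vol_ml."""
    return stock_conc * stock_vol_ml / total_vol_ml

aliquot_ml = 30.0   # T7 DNA polymerase aliquot at 2 mg/ml (22 uM)
dtt_ml = 0.6        # volume of 250 mM DTT stock added
iron_ml = 0.6       # volume of 1 mM ferrous ammonium sulfate stock added
total_ml = aliquot_ml + dtt_ml + iron_ml  # assumption: volumes are additive

dtt_mM = final_conc(250.0, dtt_ml, total_ml)            # stock given in mM
iron_uM = final_conc(1000.0, iron_ml, total_ml)         # 1 mM stock = 1000 uM
polymerase_uM = final_conc(22.0, aliquot_ml, total_ml)  # 22 uM before additions

# Molar excess of DTT over iron (both converted to uM).
dtt_excess_over_iron = dtt_mM * 1000.0 / iron_uM

print(f"DTT: {dtt_mM:.2f} mM")
print(f"Iron: {iron_uM:.1f} uM")
print(f"Polymerase: {polymerase_uM:.1f} uM")
print(f"DTT excess over iron: {dtt_excess_over_iron:.0f}X")
```

Under these assumptions the calculation gives roughly 4.8 mM DTT, 19 µM iron and 21 µM polymerase, which is in line with the rounded figures of 5 mM, 20 µM and 250X given in the text.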

The modification is carried out at 0°C under a saturated oxygen atmosphere as follows. The reaction mixture
is placed on ice within a desiccator, the desiccator is purged of air by evacuation and subsequently filled
with 100% oxygen. This cycle is repeated three times. The reaction can be performed in air (20% oxygen), but
occurs at one third the rate.

The time course of loss of exonuclease activity is shown in Fig. 4. ³H-labeled double-stranded DNA (6 cpm/pmol)
was prepared from bacteriophage T7 as described by Richardson (15 J. Molec. Biol. 49, 1966). ³H-labeled
single-stranded T7 DNA was prepared immediately prior to use by denaturation of double-stranded ³H-labeled
T7 DNA with 50 mM NaOH at 20°C for 15 min, followed by neutralization with HCl. The standard exonuclease
assay used is a modification of the procedure described by Chase et al. (supra). The standard reaction mixture
(100 µl final volume) contained 40 mM Tris/HCl pH 7.5, 10 mM MgCl2, 10 mM dithiothreitol, 60 nmol ³H-labeled
single-stranded T7 DNA (6 cpm/pmol), and varying amounts of T7 DNA polymerase. ³H-labeled double-stranded
T7 DNA can also be used as a substrate. Also, any uniformly radioactively labeled DNA, single- or double-stranded,
can be used for the assay. Also, 3' end labeled single- or double-stranded DNA can be used for the assay.
After incubation at 37°C for 15 min, the reaction is stopped by the addition of 30 µl of BSA (10 mg/ml) and 25 µl
of TCA (100% w/v). The assay can be run at 10°C-45°C for 1-60 min. The DNA is precipitated on ice for 15 min
(1 min - 12 hr), then centrifuged at 12,000 g for 30 min (5 min - 3 hr). 100 µl of the supernatant is used to determine
the acid-soluble radioactivity by adding it to 400 µl water and 5 ml of aqueous scintillation cocktail.

One unit of exonuclease activity catalyzes the acid solubilization of 10 nmol of total nucleotide in 30 min
under the conditions of the assay. Native T7 DNA polymerase has a specific exonuclease activity of 5000
units/mg, using the standard assay conditions stated above. The specific exonuclease activity of the modified T7
DNA polymerase depends upon the extent of chemical modification, but ideally is at least 10-100-fold lower than
that of native T7 DNA polymerase, or 500 to 50 or less units/mg using the standard assay conditions stated
above. When double stranded substrate is used the exonuclease activity is about 7-fold higher.
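As a rough illustration of how a scintillation count from the standard assay might be converted into units, the following sketch applies the unit definition above. It rests on several assumptions that are not stated in the source: that the stopped reaction has a volume of 155 µl (100 µl reaction plus 30 µl BSA plus 25 µl TCA), that background has already been subtracted, and that solubilization is linear in time so a 15 min result can be scaled to 30 min. The function name and the example count of 6000 cpm are hypothetical.

```python
def exonuclease_units(cpm_counted,
                      cpm_per_pmol=6.0,         # specific activity of the 3H DNA
                      counted_ul=100.0,         # supernatant volume counted
                      stopped_volume_ul=155.0,  # assumed: 100 + 30 (BSA) + 25 (TCA)
                      incubation_min=15.0):
    """Convert acid-soluble cpm into units (1 unit = 10 nmol solubilized in 30 min)."""
    pmol_counted = cpm_counted / cpm_per_pmol
    pmol_total = pmol_counted * stopped_volume_ul / counted_ul
    # Assumes linear kinetics when scaling the incubation time to 30 min.
    nmol_per_30_min = pmol_total / 1000.0 * (30.0 / incubation_min)
    return nmol_per_30_min / 10.0

# Hypothetical example: 6000 cpm counted from 100 ul of supernatant.
units = exonuclease_units(6000.0)
print(f"{units:.2f} units in the assay")

# Target range for modified enzyme: 10- to 100-fold below the native 5000 units/mg.
native_units_per_mg = 5000.0
targets = [native_units_per_mg / fold for fold in (10, 100)]
print(targets)
```

Dividing the result by the mass of enzyme in the assay (in mg) would give a specific activity comparable to the 5000 units/mg figure quoted for the native enzyme.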

Under the conditions outlined, the exonuclease activity decays exponentially, with a half-life of
eight hours. Once per day the reaction vessel is mixed to distribute the dissolved oxygen, otherwise the reaction
will proceed more rapidly at the surface where the concentration of oxygen is higher. Once per day 2.5 mM
DTT (0.3 ml of a fresh 250 mM stock to a 30 ml reaction) is added to replenish the oxidized DTT.

After eight hours, the exonuclease activity of T7 DNA polymerase has been reduced 50%, with negligible
loss of polymerase activity. The 50% loss may be the result of the complete inactivation of exonuclease activity
of half the polymerase molecules, rather than a general reduction of the rate of exonuclease activity in all the 
molecules. Thus, after an eight hour reaction all the molecules have normal polymerase activity, half the 
molecules have normal exonuclease activity, while the other half have <0.1% of their original exonuclease 
activity. 

When 50% of the molecules are modified (an eight hour reaction), the enzyme is suitable, although
suboptimal, for DNA sequencing. For optimal DNA sequencing quality, the reaction is allowed to proceed
to greater than 99% modification (having less than 50 units of exonuclease activity), which requires four days.
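The time course described here can be sketched with a simple first-order decay model. This is an idealization: it assumes the eight hour half-life holds unchanged over the whole reaction, which the source states only for "the conditions outlined". The function names are hypothetical.

```python
import math

HALF_LIFE_H = 8.0  # half-life of exonuclease activity given in the text

def fraction_remaining(hours, half_life=HALF_LIFE_H):
    """Fraction of the original exonuclease activity left after `hours`."""
    return 0.5 ** (hours / half_life)

def hours_to_reach(fraction, half_life=HALF_LIFE_H):
    """Time needed for the activity to fall to `fraction` of its starting value."""
    return half_life * math.log2(1.0 / fraction)

native_units_per_mg = 5000.0
for hours in (8, 24, 48, 96):
    left = fraction_remaining(hours)
    print(f"{hours:3d} h: {left:.4%} remaining, "
          f"{native_units_per_mg * left:.1f} units/mg")

print(f"Time to 1% remaining: {hours_to_reach(0.01):.1f} h")
```

In this idealized model the activity falls below 1% of the native level after a little over two days, so the four day reaction recommended in the text leaves a wide margin below the 50 units/mg threshold. Real reactions may deviate from ideal exponential behavior, for example if oxygen or DTT becomes limiting.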

After four days, the reaction mixture is dialyzed against 2 changes of 250 ml of 20 mM KPO4 pH 7.4/0.1
mM dithiothreitol/0.1 mM EDTA/50% glycerol to remove the iron. The modified T7 DNA polymerase (~4 mg/ml)
is stored at -20°C.

The reaction mechanism for chemical modification of T7 DNA polymerase depends upon reactive oxygen
species generated by the presence of reduced transition metals such as $\mathrm{Fe^{2+}}$ and oxygen. A possible reaction
mechanism for the generation of hydroxyl radicals is outlined below:

(1) $\mathrm{Fe^{2+} + O_2 \rightarrow Fe^{3+} + O_2^{\bullet -}}$

(2) $\mathrm{2\,O_2^{\bullet -} + 2\,H^+ \rightarrow H_2O_2 + O_2}$

(3) $\mathrm{Fe^{2+} + H_2O_2 \rightarrow Fe^{3+} + OH^{\bullet} + OH^-}$

In equation 1, oxidation of the reduced metal ion yields superoxide radical, $\mathrm{O_2^{\bullet -}}$. The superoxide radical can
undergo a dismutation reaction, producing hydrogen peroxide (equation 2). Finally, hydrogen peroxide can
react with reduced metal ions to form hydroxyl radicals, $\mathrm{OH^{\bullet}}$ (the Fenton reaction, equation 3). The oxidized
metal ion is recycled to the reduced form by reducing agents such as dithiothreitol (DTT).

These reactive oxygen species probably inactivate proteins by irreversibly chemically altering specific
amino acid residues. Such damage is observed in SDS-PAGE of fragments of gene 5 produced by CNBr or
trypsin. Some fragments disappear, high molecular weight cross linking occurs, and some fragments are broken
into two smaller fragments.

As previously mentioned, oxygen, a reducing agent (e.g. DTT, 2-mercaptoethanol) and a transition metal
(e.g. iron) are essential elements of the modification reaction. The reaction occurs in air, but is stimulated
three-fold by use of 100% oxygen. The reaction will occur slowly in the absence of added transition metals due to
the presence of trace quantities of transition metals (1-2nM) in most buffer preparations.

As expected, inhibitors of the modification reaction include anaerobic conditions (e.g., N2) and metal
chelators (e.g. EDTA, citrate, nitrilotriacetate). In addition, the enzymes catalase and superoxide dismutase
may inhibit the reaction, consistent with the essential role of reactive oxygen species in the generation of
modified T7 DNA polymerase.

As an alternative procedure, it is possible to genetically mutate the T7 gene 5 to specifically inactivate the 
exonuclease domain of the protein. The T7 gene 5 protein purified from such mutants is ideal for use in DNA 
sequencing without the need to chemically inactivate the exonuclease by oxidation and without the secondary 
damage that inevitably occurs to the protein during chemical modification. 

Genetically modified T7 DNA polymerase can be isolated by randomly mutagenizing the gene 5 and then
screening for those mutants that have lost exonuclease activity, without loss of polymerase activity.
Mutagenesis is performed as follows. Single-stranded DNA containing gene 5 (e.g., cloned in pEMBL-8, a
plasmid containing an origin for single stranded DNA replication) under the control of a T7 RNA polymerase promoter
is prepared by standard procedure, and treated with two different chemical mutagens: hydrazine, which will
mutate C's and T's, and formic acid, which will mutate G's and A's (Myers et al., 229 Science 242, 1985). The
DNA is mutagenized at a dose which results in an average of one base being altered per plasmid molecule.
The single-stranded mutagenized plasmids are then primed with a universal 17-mer primer (see above), and
used as templates to synthesize the opposite strands. The synthesized strands contain randomly incorporated
bases at positions corresponding to the mutated bases in the templates. The double-stranded mutagenized
DNA is then used to transform the strain K38/pGP1-2, which is strain K38 containing the plasmid pGP1-2 (Tabor
et al., supra). Upon heat induction this strain expresses T7 RNA polymerase. The transformed cells are plated
at 30°C, with approximately 200 colonies per plate.

Screening for cells having T7 DNA polymerase lacking exonuclease activity is based upon the following
finding. The 3' to 5' exonuclease of DNA polymerases serves a proofreading function. When bases are
misincorporated, the exonuclease will remove the newly incorporated base which is recognized as "abnormal".
This is the case for the analog of dATP, etheno-dATP, which is readily incorporated by T7 DNA polymerase in
place of dATP. However, in the presence of the 3' to 5' exonuclease of T7 DNA polymerase, it is excised as
rapidly as it is incorporated, resulting in no net DNA synthesis. As shown in figure 6, using the alternating
copolymer poly d(AT) as a template, native T7 DNA polymerase catalyzes extensive DNA synthesis only in the
presence of dATP, and not etheno-dATP. In contrast, modified T7 DNA polymerase, because of its lack of an
associated exonuclease, stably incorporates etheno-dATP into DNA at a rate comparable to dATP. Thus, using
poly d(AT) as a template, and dTTP and etheno-dATP as precursors, native T7 DNA polymerase is unable to
synthesize DNA from this template, while T7 DNA polymerase which has lost its exonuclease activity will be
able to use this template to synthesize DNA.

The procedure for lysing and screening large numbers of colonies is described in Raetz (72 Proc. Natl. Acad.
Sci. 2274, 1975). Briefly, the K38/pGP1-2 cells transformed with the mutagenized gene 5-containing plasmids
are transferred from the petri dish, where they are present at approximately 200 colonies per plate, to a piece
of filter paper ("replica plating"). The filter paper discs are then placed at 42°C for 60 min to induce the T7 RNA
polymerase, which in turn expresses the gene 5 protein. Thioredoxin is constitutively produced from the
chromosomal gene. Lysozyme is added to the filter paper to lyse the cells. After a freeze thaw step to ensure
cell lysis, the filter paper discs are incubated with poly d(AT), [α-³²P]dTTP and etheno-dATP at 37°C for 60 min.
The filter paper discs are then washed with acid to remove the unincorporated [³²P]dATP. DNA will precipitate
on the filter paper in acid, while nucleotides will be soluble. The washed filter paper is then used to expose
X-ray film. Colonies which have induced an active T7 DNA polymerase which is deficient in its exonuclease
will have incorporated acid-insoluble ³²P, and will be visible by autoradiography. Colonies expressing native T7
DNA polymerase, or expressing a T7 DNA polymerase defective in polymerase activity, will not appear on the
autoradiograph.

Colonies which appear positive are recovered from the master petri dish containing the original colonies.
Cells containing each potential positive clone will be induced on a larger scale (one liter) and T7 DNA polymerase
purified from each preparation to ascertain the levels of exonuclease associated with each mutant. Those
low in exonuclease are appropriate for DNA sequencing.

Directed mutagenesis may also be used to isolate genetic mutants in the exonuclease domain of the T7
gene 5 protein. The following is an example of this procedure.

T7 DNA polymerase with reduced exonuclease activity (modified T7 DNA polymerase) can also be
distinguished from native T7 DNA polymerase by its ability to synthesize through regions of secondary structure.
Thus, with modified DNA polymerase, DNA synthesis from a labeled primer on a template having secondary
structure will result in significantly longer extensions, compared to unmodified or native DNA polymerase. This
assay provides a basis for screening for the conversion of small percentages of DNA polymerase molecules
to a modified form.

The above assay was used to screen for altered T7 DNA polymerase after treatment with a number of
chemical reagents. Three reactions resulted in conversion of the enzyme to a modified form. The first is treatment
with iron and a reducing agent, as described above. The other two involve treatment of the enzyme with
photo-oxidizing dyes, Rose Bengal and methylene blue, in the presence of light. The dyes must be titrated carefully,
and even under optimum conditions the specificity of inactivation of exonuclease activity over polymerase
activity is low, compared to the high specificity of the iron-induced oxidation. Since these dyes are quite specific
for modification of histidine residues, this result strongly implicates histidine residues as an essential species
in the exonuclease active site.

There are 23 histidine residues in T7 gene 5 protein. Eight of these residues lie in the amino half of the
protein, in the region where, based on the homology with the large fragment of E. coli DNA polymerase I, the
exonuclease domain may be located (Ollis et al., Nature 313, 818, 1984). As described below, seven of the eight
histidine residues were mutated individually by synthesis of appropriate oligonucleotides, which were then
incorporated into gene 5. These correspond to mutants 1, and 6-10 in table 1.

The mutations were constructed by first cloning the T7 gene 5 from pGP5-3 (Tabor et al., J. Biol. Chem.
262, 1987) into the SmaI and HindIII sites of the vector M13 mp18, to give mGP5-2. (The vector used and the
source of gene 5 are not critical in this procedure.) Single-stranded mGP5-2 DNA was prepared from a strain
that incorporates deoxyuracil in place of deoxythymidine (Kunkel, Proc. Natl. Acad. Sci. USA 82, 488, 1985).
This procedure provides a strong selection for survival of only the synthesized strand (that containing the
mutation) when transfected into wild-type E. coli, since the strand containing uracil will be preferentially degraded.
Mutant oligonucleotides, 15-20 bases in length, were synthesized by standard procedures. Each
oligonucleotide was annealed to the template, extended using native T7 DNA polymerase, and ligated using T4 DNA
ligase. Covalently closed circular molecules were isolated by agarose gel electrophoresis, run in the presence
of 0.5 µg/ml ethidium bromide. The resulting purified molecules were then used to transform E. coli 71.18. DNA
from the resulting plaques was isolated and the relevant region sequenced to confirm each mutation.

The following summarizes the oligonucleotides used to generate genetic mutants in the gene 5
exonuclease. The mutations created are underlined. Amino acid and base pair numbers are taken from Dunn et al.,
166 J. Molec. Biol. 477, 1983. The relevant wild type sequences of the region of gene 5 mutated are also shown.

Wild type sequence:

109 (aa)                                            122 123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ser His Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGG TCT CAC GCT TTG GAG
14677 (T7 bp)
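The residue and base pair numbering used in these listings can be cross-checked with a few lines of code. The sketch below is illustrative: it assumes that T7 bp 14677 marks the first base of the first codon shown (Leu 109), which the layout suggests but the text does not state outright. The codon table covers only the codons present in this region, and all names are hypothetical.

```python
# Minimal codon table covering only the codons that occur in this region.
CODON_TO_AA = {
    "CTT": "Leu", "CTG": "Leu", "TTG": "Leu",
    "CGT": "Arg", "CGC": "Arg",
    "TCC": "Ser", "TCT": "Ser",
    "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "AAG": "Lys", "AAA": "Lys",
    "CCC": "Pro", "TTT": "Phe", "CAC": "His",
    "GCT": "Ala", "GAG": "Glu",
}

WILD_TYPE = ("CTT CTG CGT TCC GGC AAG TTG CCC GGA "
             "AAA CGC TTT GGG TCT CAC GCT TTG GAG").split()

FIRST_RESIDUE = 109  # residue number of the first codon shown
FIRST_BP = 14677     # assumed T7 coordinate of the first base of that codon

def annotate(codons, first_residue=FIRST_RESIDUE, first_bp=FIRST_BP):
    """Return (residue number, amino acid, codon, T7 bp of first base) per codon."""
    return [
        (first_residue + i, CODON_TO_AA[codon], codon, first_bp + 3 * i)
        for i, codon in enumerate(codons)
    ]

rows = annotate(WILD_TYPE)
by_residue = {row[0]: row for row in rows}

# Residues targeted by several of the mutations listed in this section.
for residue in (111, 112, 114, 118, 119, 122, 123):
    print(by_residue[residue])
```

Under the stated assumption, His 123 is encoded by CAC starting at T7 bp 14719, and the third base of the Gly 117 codon (GGA) falls at T7 bp 14703, which agrees with the A to G change at that position reported for pGP5-6 later in the description.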

Mutation 1: His 123 -> Ser 123

Primer used: 5' CGC TTT GGA TCC TCC GCT TTG 3'

Mutant sequence:

                                                        123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ser Ser Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGA TCC TCC GCT TTG GAG

EP 0 386 857 B1 



Mutation 2: Deletion of Ser 122 and His 123

Primer used: 5' GGA AAA CGC TTT GGC GCC TTG GAG GCG 3'
6 base deletion

Mutant sequence:

                                                    122 123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGC GCC TTG GAG

Mutation 3: Ser 122, His 123 -> Ala 122, Glu 123

Primer used: 5' CGC TTT GGG GCT GAG GCT TTG G 3'

Mutant sequence:

                                                    122 123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ala Glu Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGG GCT GAG GCT TTG GAG

Mutation 4: Lys 118, Arg 119 -> Glu 118, Glu 119

Primer used: 5' G CCC GGG GAA GAG TTT GGG TCT CAC GC 3'

Mutant sequence:

Leu Leu Arg Ser Gly Lys Leu Pro Gly Glu Glu Phe Gly Ser His Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGG GAA GAG TTT GGG TCT CAC GCT TTG GAG



Mutation 5: Arg 111, Ser 112, Lys 114 -> Glu 111, Ala 112, Glu 114

Primer used: 5' G GGT CTT CTG GAA GCC GGC GAG TTG CCC GG 3'

Mutant sequence:

Leu Leu Glu Ala Gly Glu Leu Pro Gly Lys Arg Phe Gly Ser His Ala Leu Glu
CTT CTG GAA GCC GGC GAG TTG CCC GGA AAA CGC TTT GGG TCT CAC GCT TTG GAG



Mutation 6: His 59, His 62 -> Ser 59, Ser 62

Primer used: 5' ATT GTG TTC TCC AAC GGA TCC AAG TAT GAC G 3'

Wild-type sequence:

Leu Ile Val Phe His Asn Gly His Lys Tyr Asp Val
CTT ATT GTG TTC CAC AAC GGT CAC AAG TAT GAC GTT
T7 bp: 14515

Mutant sequence:

                59          62
Leu Ile Val Phe Ser Asn Gly Ser Lys Tyr Asp Val
CTT ATT GTG TTC TCC AAC GGA TCC AAG TAT GAC GTT



Mutation 7: His 82 -> Ser 82

Primer used: 5' GAG TTC TCC CTT CCT CG 3'

Wild-type sequence:

77 (aa)             82
Leu Asn Arg Glu Phe His Leu Pro Arg Glu Asn
TTG AAC CGA GAG TTC CAC CTT CCT CGT GAG AAC
T7 bp: 14581

Mutant sequence:

                    82
Leu Asn Arg Glu Phe Ser Leu Pro Arg Glu Asn
TTG AAC CGA GAG TTC TCC CTT CCT CGT GAG AAC



Mutation 8: Arg 96, His 99 -> Leu 96, Ser 99

Primer used: 5' CTG TTG ATT TCT TCC AAC CTC 3'

Wild-type sequence:

93 (aa)     96          99
Val Leu Ser Arg Leu Ile His Ser Asn Leu Lys Asp Thr Asp
GTG TTG TCA CGT TTG ATT CAT TCC AAC CTC AAG GAC ACC GAT
T7 bp: 14629

Mutant sequence:

            96          99
Val Leu Ser Leu Leu Ile Ser Ser Asn Leu Lys Asp Thr Asp
GTG TTG TCA CTG TTG ATT TCT TCC AAC CTC AAG GAC ACC GAT



Mutation 9: His 190 -> Ser 190

Primer used: 5' CT GAC AAA TCT TAC TTC CCT 3'

Wild-type sequence:

185 (aa)            190
Leu Leu Ser Asp Lys His Tyr Phe Pro Pro Glu
CTA CTC TCT GAC AAA CAT TAC TTC CCT CCT GAG
T7 bp: 14905

Mutant sequence:

                    190
Leu Leu Ser Asp Lys Ser Tyr Phe Pro Pro Glu
CTA CTC TCT GAC AAA TCT TAC TTC CCT CCT GAG
Mutation 10: His 218 -> Ser 218

Pri „r used: 5' ^ ATT OA V CB GCT GC 
Wild-type sequence: 

a*: 214 . 1)1 M ja a rrp Leu Leu 

T7 bp: 14992 



Mutant sequence: 



Mutation 11: Deletion of amino acids 118 to 123

Primer used: 5' C GGC AAG TTG CCC GGG GCT TTG GAG GCG TGG G 3'
18 base deletion

Wild-type sequence:

109 (aa)                                            122 123         126
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ser His Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGG TCT CAC GCT TTG GAG
14677 (T7 bp)

Mutant sequence:

                                                    124
Leu Leu Arg Ser Gly Lys Leu Pro Gly (6 amino acids) Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGG (18 bases) GCT TTG GAG

Mutation 12: His 123 -> Glu 123

Primer used: 5' GGG TCT GAG GCT TTG G 3'

Mutant sequence:

                                                        123
Leu Leu Arg Ser Gly Lys Leu Pro Gly Lys Arg Phe Gly Ser Glu Ala Leu Glu
CTT CTG CGT TCC GGC AAG TTG CCC GGA AAA CGC TTT GGG TCT GAG GCT TTG GAG



Mutation 13: (Arg 131, Lys 136, Lys 140, Lys 144, Arg 145 ->
Glu 131, Glu 136, Glu 140, Glu 144, Glu 145)



Primer used: 5* GCT TAT G&fi GC GGC GAG ATG GAG GGT GAA TAC GAA GAC GAC TTT GAG GAA ATS 
CTT GAA G 3 ■ 
Wild-type sequence:

129     131                 136             140             144 145  (aa)
Gly Tyr Arg Leu Gly Glu Met Lys Gly Glu Tyr Lys Asp Asp Phe Lys Arg Met Leu Glu Glu
GGT TAT CGC TTA GGC GAG ATG AAG GGT GAA TAC AAA GAC GAC TTT AAG CGT ATG CTT GAA G
14737 (T7 bp)



Mutant sequence: 

129     131                 136             140             144 145  (aa)
Gly Tyr Glu Leu Gly Glu Met Glu Gly Glu Tyr Glu Asp Asp Phe Glu Glu Met Leu Glu Glu

CGT TAT fiifi dC GGC GAG ATG GAG GGT GAA TAS OAA GAC GAC TTT GAG &&& ATG CTT GAA G 
14737 (T7 bp) 



Each mutant gene 5 protein was produced by infection of the mutant phage into K38/pGP1-2, as follows. 
The cells were grown at 30°C to an A^^I.O. The temperature was shifted to 42°C for 30 min., to induce T7 
20 RNA polymerase. IPTG was added to 0.5 mM, and a lysate of each phage was added at a moi=10. Infected 
cells were grown at 37°C for 90 min. The cells were then harvested and extracts prepared by standard pro- 
cedures for 17 gene 5 protein. 

Extracts were partially purified by passage over a phosphocellulose and DEAEA-50 column, and assayed 
by measuring the polymerase and exonuclease activities directly, as described above. The results are shown 
25 in Table 1. 



Table 1

SUMMARY OF EXONUCLEASE AND POLYMERASE
ACTIVITIES OF T7 GENE 5 MUTANTS

                                                          Exonuclease    Polymerase
Mutant                                                    activity, %    activity, %

[Wild-type]                                               [100]a         [100]b
Mutant 1  (His 123 -> Ser 123)                            10-25          >90
Mutant 2  (Δ Ser 122, His 123)                            0.2-0.4        >90
Mutant 3  (Ser 122, His 123 -> Ala 122, Glu 123)          <2             >90
Mutant 4  (Lys 118, Arg 119 -> Glu 118, Glu 119)          <30            >90
Mutant 5  (Arg 111, Ser 112, Lys 114 ->
           Glu 111, Ala 112, Glu 114)                     >75            >90
Mutant 6  (His 59, His 62 -> Ser 59, Ser 62)              >75            >90
Mutant 7  (His 82 -> Ser 82)                              >75            >90
Mutant 8  (Arg 96, His 99 -> Leu 96, Ser 99)              >75            >90
Mutant 9  (His 190 -> Ser 190)                            >75            >90
Mutant 10 (His 218 -> Ser 218)                            >75            >90
Mutant 11 (Δ Lys 118, Arg 119, Phe 120,
           Gly 121, Ser 122, His 123)                     <0.02          >90
Mutant 12 (His 123 -> Glu 123)                            <30            >90
Mutant 13 (Arg 131, Lys 136, Lys 140, Lys 144, Arg 145 ->
           Glu 131, Glu 136, Glu 140, Glu 144, Glu 145)   <30            >90

a. Exonuclease activity was measured on single-stranded [³H]T7 DNA.
   100% exonuclease activity corresponds to 5,000 units/mg.
b. Polymerase activity was measured using single-stranded calf thymus DNA.
   100% polymerase activity corresponds to 8,000 units/mg.



Of the seven histidines tested, only one (His 123: mutant 1) has the enzymatic activities characteristic of
modified T7 DNA polymerase. T7 gene 5 protein was purified from this mutant using DEAE-cellulose,
phosphocellulose, DEAE-Sephadex and hydroxylapatite chromatography. While the polymerase activity was nearly
normal (>90% the level of the native enzyme), the exonuclease activity was reduced 4 to 10-fold.

A variant of this mutant was constructed in which both His 123 and Ser 122 were deleted. The gene 5 protein
purified from this mutant has a 200-500 fold lower exonuclease activity, again with retention of >90% of the
polymerase activity.
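The fold reductions quoted for these two mutants follow from the residual exonuclease percentages in Table 1. The short sketch below shows the conversion; the function name is hypothetical and the percentages are copied from the table rows for mutants 1 and 2.

```python
def fold_reduction_range(residual_low_pct, residual_high_pct):
    """Convert a residual-activity range (% of wild type) into a fold-reduction range."""
    # The highest residual activity corresponds to the smallest fold reduction.
    return (100.0 / residual_high_pct, 100.0 / residual_low_pct)

# Residual exonuclease activity ranges (% of wild type), from Table 1.
residual_pct = {
    "Mutant 1": (10.0, 25.0),
    "Mutant 2": (0.2, 0.4),
}

for name, (low, high) in residual_pct.items():
    fold_min, fold_max = fold_reduction_range(low, high)
    print(f"{name}: {fold_min:.0f}- to {fold_max:.0f}-fold lower exonuclease activity")
```

For mutant 1 this reproduces the 4 to 10-fold reduction stated in the text. For mutant 2 the table values give roughly 250 to 500-fold, close to the 200-500 fold range quoted above.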

These data strongly suggest that His 123 lies in the active site of the exonuclease domain of T7 gene 5
protein. Furthermore, it is likely that the His 123 is in fact the residue being modified by the oxidation involving
iron, oxygen and a reducing agent, since such oxidation has been shown to modify histidine residues in other
proteins (Levine, J. Biol. Chem. 258: 11823, 1983; and Hodgson et al. Biochemistry 14: 5294, 1975). The level
of residual exonuclease in mutant 11 is comparable to the levels obtainable by chemical modification.

Although mutations at His residues are described, mutations at nearby sites or even at distant sites may
also produce mutant enzymes suitable in this invention, e.g., Lys and Arg (mutants 4 and 15). Similarly, although
mutations in some His residues have little effect on exonuclease activity, that does not necessarily indicate that
mutations near these residues will not affect exonuclease activity.

Mutations which are especially effective include those having deletions of 2 or more amino acids, preferably
6-8, for example, near the His-123 region. Other mutations should reduce exonuclease activity further, or
completely.

As an example of the use of these mutant strains the following is illustrative. A pGP5-6 (mutation
11)-containing strain has been deposited with the ATCC (see below). The strain is grown as described above and
induced as described in Tabor et al. J. Biol. Chem. 262:16212 (1987). K38/pTrx-2 cells may be added to increase
the yield of genetically modified T7 DNA polymerase.

The above noted deposited strain also contains plasmid pGP1-2 which expresses T7 RNA polymerase. This
plasmid is described in Tabor et al., Proc. Natl. Acad. Sci. USA 82:1074, 1985 and was deposited with the ATCC
on March 22, 1985 and assigned the number 40,175.

Referring to Fig. 10, pGP5-6 includes the following segments:

1. EcoRI-SacI-SmaI-BamHI polylinker sequence from M13 mp10 (21 bp).

2. T7 bp 14309 to 16747, that contains the T7 gene 5, with the following modifications:

T7 bp 14703 is changed from an A to a G, creating a SmaI site.
T7 bp 14304 to 14321 inclusive are deleted (18 bp).

3. SalI-PstI-HindIII polylinker sequence from M13 mp10 (15 bp).

4. pBR322 bp 29 (HindIII site) to pBR322 bp 375 (BamHI site).

5. T7 bp 22855 to T7 bp 22927, that contains the T7 RNA polymerase promoter φ10, with BamHI linkers
inserted at each end (82 bp).

6. pBR322 bp 375 (BamHI site) to pBR322 bp 4361 (EcoRI site).

DNA Sequencing Using Modified T7-type DNA Polymerase

DNA synthesis reactions using modified T7-type DNA polymerase result in chain-terminated fragments of
uniform radioactive intensity, throughout the range of several bases to thousands of bases in length. There is
virtually no background due to terminations at sites independent of chain terminating agent incorporation (i.e.
at pause sites or secondary structure impediments).

Sequencing reactions using modified T7-type DNA polymerase consist of a pulse and chase. By pulse is
meant that a short labelled DNA fragment is synthesized; by chase is meant that the short fragment is
lengthened until a chain terminating agent is incorporated. The rationale for each step differs from conventional
DNA sequencing reactions. In the pulse, the reaction is incubated at 0°C-37°C for 0.5-4 min in the presence
of high levels of three nucleotide triphosphates (e.g., dGTP, dCTP and dTTP) and limiting levels of one other
labelled, carrier-free, nucleotide triphosphate, e.g., [³⁵S]dATP. Under these conditions the modified polymerase
is unable to exhibit its processive character, and a population of radioactive fragments will be synthesized
ranging in size from a few bases to several hundred bases. The purpose of the pulse is to radioactively label each
primer, incorporating maximal radioactivity while using minimal levels of radioactive nucleotides. In this
example, two conditions in the pulse reaction (low temperature, e.g., from 0-20°C, and limiting levels of dATP,
e.g., from 0.1 µM to 1 µM) prevent the modified T7-type DNA polymerase from exhibiting its processive character.
Other essential environmental components of the mixture will have similar effects, e.g., limiting more than one
nucleotide triphosphate or increasing the ionic strength of the reaction. If the primer is already labelled (e.g.,
by kinasing) no pulse step is required.

In the chase, the reaction is incubated at 45°C for 1-30 min in the presence of high levels (50-500 µM) of
all four deoxynucleoside triphosphates and limiting levels (1-50 µM) of any one of the four chain terminating
agents, e.g., dideoxynucleoside triphosphates, such that DNA synthesis is terminated after an average of
50-600 bases. The purpose of the chase is to extend each radioactively labeled primer under conditions of
processive DNA synthesis, terminating each extension exclusively at correct sites in four separate reactions using
each of the four dideoxynucleoside triphosphates. Two conditions of the chase (high temperature, e.g., from
30-50°C, and high levels (above 50 µM) of all four deoxynucleoside triphosphates) allow the modified T7-type
DNA polymerase to exhibit its processive character for tens of thousands of bases; thus the same polymerase
molecule will synthesize from the primer-template until a dideoxynucleotide is incorporated. At a chase
temperature of 45°C synthesis occurs at >700 nucleotides/sec. Thus, for sequencing reactions the chase is
complete in less than a second. ssb increases processivity, for example, when using dITP, or when using low
temperatures or high ionic strength, or low levels of triphosphates throughout the sequencing reaction.

Either [α-³⁵S]dATP, [α-³²P]dATP or fluorescently labelled nucleotides can be used in the DNA sequencing
reactions with modified T7-type DNA polymerase. If the fluorescent analog is at the 5' end of the primer, then
no pulse step is required.

Two components determine the average extensions of the synthesis reactions. First is the length of time
of the pulse reaction. Since the pulse is done in the absence of chain terminating agents, the longer the pulse
reaction time, the longer the primer extensions. At 0°C the polymerase extensions average 10 nucleotides/sec.
Second is the ratio of deoxyribonucleoside triphosphates to chain terminating agents in the chase reaction. A
modified T7-type DNA polymerase does not discriminate against the incorporation of these analogs, thus the
average length of extension in the chase is four times the ratio of the deoxynucleoside triphosphate
concentration to the chain terminating agent concentration in the chase reaction. Thus, in order to shorten the average
size of the extensions, the pulse time is shortened, e.g., to 30 sec, and/or the ratio of chain terminating agent
to deoxynucleoside triphosphate concentration is raised in the chase reaction. This can be done either by
raising the concentration of the chain terminating agent or lowering the concentration of deoxynucleoside
triphosphate. To increase the average length of the extensions, the pulse time is increased, e.g., to 3-4 min, and/or
the concentration of chain terminating agent is lowered (e.g., from 20 µM to 2 µM) in the chase reaction.
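The rule of thumb for the chase can be written as a short calculation. The sketch below is illustrative only: the 200 µM deoxynucleoside triphosphate concentration is a hypothetical value chosen from within the stated range, each concentration is assumed to refer to a single nucleotide species, and the 700 nucleotides/sec rate is the lower bound quoted for 45°C. The function names are not from the source.

```python
def mean_chase_extension(dntp_uM, terminator_uM):
    """Average chase extension (bases): four times the dNTP : terminator ratio."""
    return 4.0 * dntp_uM / terminator_uM

def chase_seconds(bases, rate_nt_per_s=700.0):
    """Time to synthesize `bases` at the quoted rate for a 45 C chase."""
    return bases / rate_nt_per_s

dntp_uM = 200.0  # hypothetical concentration, within the "high levels" range

# Lowering the terminator from 20 uM to 2 uM, as in the example in the text.
for terminator_uM in (20.0, 2.0):
    length = mean_chase_extension(dntp_uM, terminator_uM)
    print(f"terminator {terminator_uM:4.1f} uM: ~{length:.0f} bases, "
          f"~{chase_seconds(length):.2f} s at 700 nt/s")
```

With these illustrative numbers, a tenfold drop in terminator concentration lengthens the average extension tenfold, and even the longer extensions are completed in well under a second at the quoted rate.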

Example 2: DNA sequencing using modified T7 DNA polymerase 

The following is an example of a sequencing protocol using dideoxy nucleotides as terminating agents. 

9 µl of single-stranded M13 DNA (mGP1-2, prepared by standard procedures) at 0.7 mM concentration is
mixed with 1 µl of complementary sequencing primer (standard universal 17-mer, 0.5 pmole primer/µl) and
2.5 µl 5X annealing buffer (200 mM Tris-HCl, pH 7.5, 50 mM MgCl2), heated to 65°C for 3 min, and slow cooled
to room temperature over 30 min. In the pulse reaction, 12.5 µl of the above annealed mix was mixed with 1
µl dithiothreitol 0.1 M, 2 µl of 3 dNTPs (dGTP, dCTP, dTTP) 3 mM each (P-L Biochemicals, in TE), 2.5 µl
[α-³⁵S]dATP (1500 Ci/mmol, New England Nuclear) and 1 µl of modified T7 DNA polymerase described in
Example 1 (0.4 mg/ml, 2500 units/ml, i.e. 0.4 µg, 2.5 units) and incubated at 0°C for 2 min, after vortexing and
centrifuging in a microfuge for 1 sec. The time of incubation can vary from 30 sec to 20 min and temperature
can vary from 0°C to 37°C. Longer times are used for determining sequences distant from the primer.
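The volumes and amounts in this protocol can be tallied with a short script. This is only a bookkeeping sketch: it assumes volumes are additive, the dictionary keys are informal labels rather than names from the source, and the 4.5 µl aliquot size is taken from the step that follows in the protocol.

```python
# Volumes in microliters, as listed in the protocol.
annealing_mix = {"M13 DNA": 9.0, "primer": 1.0, "5X annealing buffer": 2.5}
pulse_additions = {
    "dithiothreitol 0.1 M": 1.0,
    "3 dNTPs, 3 mM each": 2.0,
    "[35S]dATP": 2.5,
    "modified T7 DNA polymerase": 1.0,
}

annealed_ul = sum(annealing_mix.values())
pulse_ul = annealed_ul + sum(pulse_additions.values())  # assumes additive volumes

# Four aliquots of 4.5 ul are drawn from the pulse reaction in the next step.
aliquots_ul = 4 * 4.5

# Enzyme added: 1 ul of a 0.4 mg/ml, 2500 units/ml stock.
enzyme_ug = 0.4 * pulse_additions["modified T7 DNA polymerase"]  # mg/ml equals ug/ul
enzyme_units = 2500.0 / 1000.0 * pulse_additions["modified T7 DNA polymerase"]

# Final concentrations in the pulse reaction.
dntp_each_uM = 3000.0 * pulse_additions["3 dNTPs, 3 mM each"] / pulse_ul
dtt_mM = 100.0 * pulse_additions["dithiothreitol 0.1 M"] / pulse_ul

print(f"annealed mix: {annealed_ul} ul, pulse reaction: {pulse_ul} ul")
print(f"needed for four aliquots: {aliquots_ul} ul")
print(f"enzyme: {enzyme_ug} ug, {enzyme_units} units")
print(f"each unlabelled dNTP: {dntp_each_uM:.0f} uM, DTT: {dtt_mM:.1f} mM")
```

The tally confirms that the annealed mix is 12.5 µl and the pulse reaction 19 µl, enough for the four 4.5 µl aliquots, and that 1 µl of the enzyme stock corresponds to 0.4 µg and 2.5 units as stated.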

4.5 µl aliquots of the above pulse reaction are added to each of four tubes containing the chase mixes,
preheated to 45°C. The four tubes, labeled G, A, T, C, each contain trace amounts of either dideoxy (dd) G, A,
T, or C (P-L Biochemicals). The specific chase solutions are given below. Each tube contains 1.5 µl dATP 1 mM,
0.5 µl 5X annealing buffer (200 mM Tris-HCl, pH 7.5, 50 mM MgCl2), and 1.0 µl ddNTP 100 µM (where ddNTP
corresponds to ddG, A, T or C in the respective tubes). Each chase reaction is incubated at 45°C (or 30°C-50°C)
for 10 min, and then 6 µl of stop solution (90% formamide, 10 mM EDTA, 0.1% xylene cyanol) is added to each
tube, and the tube placed on ice. The chase times can vary from 1-30 min.

The sequencing reactions are run on a standard 6% polyacrylamide sequencing gel in 7 M urea, at 30 Watts
for 6 hours. Prior to running on a gel the reactions are heated to 75°C for 2 min. The gel is fixed in 10% acetic
acid, 10% methanol, dried on a gel dryer, and exposed to Kodak OM1 high-contrast autoradiography film
overnight.


Example 3: DNA sequencing using limiting concentrations of dNTPs 

In this example DNA sequence analysis of mGP1-2 DNA is performed using limiting levels of all four
deoxyribonucleoside triphosphates in the pulse reaction. This method has a number of advantages over the protocol
in Example 2. First, the pulse reaction runs to completion, whereas in the previous protocol it was necessary
to interrupt a time course. As a consequence the reactions are easier to run. Second, with this method it is
easier to control the extent of the elongations in the pulse, and so the efficiency of labeling of sequences near
the primer (the first 50 bases) is increased approximately 10-fold.

7 µl of 0.75 mM single-stranded M13 DNA (mGP1-2) was mixed with 1 µl of complementary sequencing primer
(17-mer, 0.5 pmole primer/µl) and 2 µl of 5X annealing buffer (200 mM Tris-HCl pH 7.5, 50 mM MgCl2, 250 mM
NaCl), heated at 65°C for 2 min, and slowly cooled to room temperature over 30 min. In the pulse reaction 10
µl of the above annealed mix was mixed with 1 µl of 0.1 M dithiothreitol, 2 µl of 3 dNTPs (dGTP, dCTP, dTTP), 1.5
µM each, 0.5 µl of [α-35S]dATP (about 10 µM, 1500 Ci/mmol, New England Nuclear) and 2 µl of modified T7
DNA polymerase (0.1 mg/ml, 1000 units/ml, i.e., 0.2 µg, 2 units), and incubated at 37°C for 5 min. (The temperature
and time of incubation can be varied from 20°C-45°C and 1-60 min, respectively.)
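
To make the "limiting" nucleotide levels concrete, the final concentrations in the pulse reaction can be estimated from the stock concentrations and volumes listed above. This is a minimal sketch that assumes the volumes are simply additive and takes the "about 10 µM" labelled dATP stock as exactly 10 µM; the function and variable names are illustrative and do not come from the original protocol.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after dilution: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Component volumes (microlitres) of the Example 3 pulse reaction as listed in the text.
pulse_volumes_ul = {
    "annealed mix": 10.0,
    "dithiothreitol": 1.0,
    "3 dNTP mix": 2.0,
    "labelled dATP": 0.5,
    "polymerase": 2.0,
}
total_ul = sum(pulse_volumes_ul.values())

# Stock concentrations in micromolar (the 10 uM dATP value is an approximation).
dntp_final_um = final_concentration(1.5, pulse_volumes_ul["3 dNTP mix"], total_ul)
datp_final_um = final_concentration(10.0, pulse_volumes_ul["labelled dATP"], total_ul)

print(f"total volume: {total_ul} ul")
print(f"dGTP, dCTP, dTTP: {dntp_final_um:.2f} uM each")
print(f"labelled dATP: {datp_final_um:.2f} uM")
```

Under these assumptions each nucleotide is present at well below 1 µM, which is consistent with the pulse reaction running to completion rather than needing to be interrupted.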

3.5 µl aliquots of the above pulse reaction were added to each of four tubes containing the chase mixes,
which were preheated to 37°C. The four tubes, labeled G, A, T, C, each contain trace amounts of either dideoxy
G, A, T, or C. The specific chase solutions are given below. Each tube contains 0.5 µl of 5X annealing buffer (200
mM Tris-HCl pH 7.5, 50 mM MgCl2, 250 mM NaCl), 1 µl of 4 dNTPs (dGTP, dATP, dTTP, dCTP), 200 µM each, and
1.0 µl of 20 µM ddNTP. Each chase reaction is incubated at 37°C for 5 min (or 20°C-45°C and 1-60 min,
respectively), and then 4 µl of a stop solution (95% formamide, 20 mM EDTA, 0.05% xylene cyanol) is added to each
tube, and the tube is placed on ice prior to running on a standard polyacrylamide sequencing gel as described
above.

Example 4: Replacement of dGTP with dITP for DNA sequencing 

In order to sequence through regions of compression in DNA, i.e., regions having compact secondary
structure, it is common to use dITP (Mills et al., 76 Proc. Natl. Acad. Sci. 2232, 1979) or deazaguanosine triphosphate
(deaza GTP, Mizusawa et al., 14 Nuc. Acid Res. 1319, 1986). We have found that both analogs function well with
T7-type polymerases, especially with dITP in the presence of ssb. Preferably these reactions are performed
with the above described genetically modified T7 polymerase, or the chase reaction is for 1-2 min, and/or at
20°C to reduce exonuclease degradation.

Modified T7 DNA polymerase efficiently utilizes dITP or deaza-GTP in place of dGTP. dITP is substituted
for dGTP in both the pulse and chase mixes at a concentration two to five times that at which dGTP is used.
In the ddG chase mix ddGTP is still used (not ddITP).

The chase reactions using dITP are sensitive to the residual low levels (about 0.01 units) of exonuclease
activity. To avoid this problem, the chase reaction times should not exceed 5 min when dITP is used. It is
recommended that the four dITP reactions be run in conjunction with, rather than to the exclusion of, the four
reactions using dGTP. If both dGTP and dITP are routinely used, the number of required mixes can be minimized
by: (1) Leaving dGTP and dITP out of the chase mixes, which means that the four chase mixes can be used
for both dGTP and dITP chase reactions. (2) Adding a high concentration of dGTP or dITP (2 µl at 0.5 mM and
1-2.5 mM, respectively) to the appropriate pulse mix. The two pulse mixes then each contain a low concentration
of dCTP, dTTP and [α-35S]dATP, and a high concentration of either dGTP or dITP. This modification does not
usually adversely affect the quality of the sequencing reactions, and reduces the required number of pulse and
chase mixes to run reactions using both dGTP and dITP to six.

The sequencing reaction is as for Example 3, except that two of the pulse mixes contain a) 3 dNTP mix for
dGTP: 1.5 µM dCTP, dTTP, and 1 mM dGTP and b) 3 dNTP mix for dITP: 1.5 µM dCTP, dTTP, and 2 mM dITP. In
the chase reaction dGTP is removed from the chase mixes (i.e., the chase mixes contain 30 µM dATP, dTTP
and dCTP, and one of the four dideoxynucleotides at 8 µM), and the chase time using dITP does not exceed
5 min.

Deposits 


Strains K38/pGP5-5/pTrx-2, K38/pTrx-2 and M13 mGP1-2 have been deposited with the ATCC and assigned
numbers 67,287, 67,286, and 40,303 respectively. These deposits were made on January 13, 1987. Strain
K38/pGP1-2/pGP5-6 was deposited with the ATCC on December 4, 1987, and assigned the number 67571.
Applicants and their assignees acknowledge their responsibility to replace these cultures should they die
before the end of the term of a patent issued hereon, 5 years after the last request for a culture, or 30 years,
whichever is the longer, and their responsibility to notify the depository of the issuance of such a patent, at which
time the deposits will be made irrevocably available to the public. Until that time the deposits will be made
irrevocably available to the Commissioner of Patents under the terms of 37 CFR Section 1.14 and 35 USC
Section 112.


Other Embodiments 

Other embodiments are within the following claims. 

The direct enzymatic amplification of genomic DNA sequences has been described, for other polymerases,
by Saiki et al., 230 Science 1350, 1985; and Scharf, 233 Science 1076, 1986.

Referring to Fig. 6, enzymatic amplification of a specific DNA region entails the use of two primers which
anneal to opposite strands of a double stranded DNA sequence in the region of interest, with their 3' ends
directed toward one another (see dark arrows). The actual procedure involves multiple (10-40, preferably 16-20)
cycles of denaturation, annealing, and DNA synthesis. Using this procedure it is possible to amplify a specific
region of human genomic DNA over 200,000 times. As a result the specific gene fragment represents about
one part in five, rather than the initial one part in a million. This greatly facilitates both the cloning and the direct
analysis of genomic DNA. For diagnostic uses, it can speed up the analysis from several weeks to 1-2 days.
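
The figures quoted above can be checked with simple arithmetic. The sketch below is an idealized model, not part of the described procedure: it assumes that each cycle multiplies the target by a constant factor (perfect doubling when `efficiency` is 1.0) and that only the target region is amplified. The function names are hypothetical.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Idealized fold increase of the target after a number of cycles.
    efficiency = 1.0 means the target doubles in every cycle."""
    return (1.0 + efficiency) ** cycles


def target_fraction(initial_fraction, fold):
    """Fraction of total DNA that is target after amplifying only the target."""
    target = initial_fraction * fold
    background = 1.0 - initial_fraction
    return target / (target + background)


# 16-20 cycles of perfect doubling bracket the quoted 200,000-fold figure.
for cycles in (16, 18, 20):
    print(cycles, "cycles ->", f"{fold_amplification(cycles):,.0f}-fold")

# Starting at one part in a million, a 200,000-fold increase gives
# roughly one part target to five parts remaining DNA.
print(f"target fraction: {target_fraction(1e-6, 200_000):.3f}")
```

In this model 18 cycles of perfect doubling already exceed 200,000-fold, so the preferred range of 16-20 cycles is consistent with the stated amplification when per-cycle efficiency is somewhat below 100%.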

Unlike Klenow fragment, where the amplification process is limited to fragments under two hundred bases
in length, modified T7-type DNA polymerases should (preferably in conjunction with E. coli DNA binding protein,
or ssb, to prevent "snapback" formation of single stranded DNA) permit the amplification of DNA fragments
thousands of bases in length.

The modified T7-type DNA polymerases are also suitable in standard reaction mixtures for: a) filling in 5'
protruding termini of DNA fragments generated by restriction enzyme cleavage in order to, for example,
produce blunt-ended double stranded DNA from a linear DNA molecule having a single stranded region with no
3' protruding termini; b) labeling the 3' termini of restriction fragments, for mapping mRNA start sites by S1
nuclease analysis, or for sequencing DNA using the Maxam and Gilbert chemical modification procedure; and c)
in vitro mutagenesis of cloned DNA fragments. For example, a chemically synthesized primer which contains
specific mismatched bases is hybridized to a DNA template, and then extended by the modified T7-type DNA
polymerase. In this way the mutation becomes permanently incorporated into the synthesized strand. It is
advantageous for the polymerase to synthesize from the primer through the entire length of the DNA. This is
most efficiently done using a processive DNA polymerase. Alternatively, mutagenesis is performed by
misincorporation during DNA synthesis (see above). This application is used to mutagenize specific regions of cloned
DNA fragments. It is important that the enzyme used lack exonuclease activity. By standard reaction mixture
is meant a buffered solution containing the polymerase and any necessary deoxynucleosides, or other
compounds.



Claims

1. A method of amplification of a DNA sequence comprising annealing a first and second primer to opposite
strands of a double stranded DNA sequence and incubating the annealed mixture with a DNA polymerase
characterized in that said polymerase is a processive bacteriophage T7-type DNA polymerase, having less
than 50% of the exonuclease activity of the naturally associated level of exonuclease activity of said
polymerase.

2. A method as claimed in claim 1 further characterized in that said DNA polymerase has less than 500 units
of exonuclease activity per mg of polymerase, and in that said first and second primers anneal to opposite
strands of said DNA sequence with their 3' ends directed toward each other after annealing, and with the
DNA sequence to be amplified located between the two annealed primers.

3. A method as claimed in claim 1 or 2 further characterized in that said polymerase possesses sufficient 
processivity to remain bound to said DNA sequence for at least 500 bases before dissociating. 

4. A method as claimed in claim 1, 2 or 3 further characterized in that said polymerase has less than 1% of
the exonuclease activity naturally associated with said polymerase.

5. A method as claimed in any of the preceding claims in which said polymerase is T7 DNA polymerase. 


Patentansprüche (Claims)

1. A method for amplifying a DNA sequence, in which a first and a second primer are annealed to opposite
strands of a double-stranded DNA sequence and the annealed mixture is incubated with a DNA polymerase,
characterized in that the polymerase is a processive T7 DNA polymerase of bacteriophage T7 which has
less than 50% of the exonuclease activity corresponding to the natural level of exonuclease activity of
this polymerase.

2. A method according to claim 1, characterized in that the DNA polymerase has less than 500 units of
exonuclease activity per mg of polymerase, and in that the first and second primers annealed to the opposite
strands of the DNA sequence have their 3' ends directed toward one another after annealing, the DNA
sequence to be amplified being located between the two annealed primers.

3. A method according to claim 1 or 2, characterized in that the polymerase has sufficient processivity to
remain bound to the DNA sequence for at least 500 bases before it dissociates.

4. A method according to claim 1, 2 or 3, characterized in that the polymerase has less than 1% of the
exonuclease activity that the polymerase naturally has.

5. A method according to any of the preceding claims, in which the polymerase is a T7 DNA polymerase.



Revendications (Claims)

1. A method of amplifying a DNA sequence, comprising annealing a first and a second primer to opposite
strands of a double-stranded DNA sequence, and incubating the annealed mixture with a DNA polymerase,
characterized in that this polymerase is a processive bacteriophage T7-type DNA polymerase having less
than 50% of the natural exonuclease activity associated with this polymerase.

2. A method according to claim 1, further characterized in that said DNA polymerase has an exonuclease
activity of less than 500 units per mg of polymerase, and in that the first and second primers anneal to
opposite strands of said DNA sequence, their 3' ends being directed toward one another after annealing,
and the DNA sequence to be amplified being located between the two annealed primers.

3. A method according to claim 1 or 2, further characterized in that this polymerase has sufficient
processivity to remain bound to the DNA sequence for at least 500 bases before dissociating.

4. A method according to claim 1, 2 or 3, further characterized in that said polymerase has less than 1%
of the natural exonuclease activity associated with this polymerase.

5. A method according to any of the preceding claims, in which said polymerase is a T7 DNA polymerase.



FIGURE 1
FIGURE 2
FIGURE 3
FIGURE 4
FIGURE 5
FIGURE 7
AATCACTGCA TAATTCGTGT

AACGGTTCTG GCAAATATTC TGAAATGAGC

210 220 230 240 250

TGTTGACAAT TAATCATCGG CTCGTATAAT GTGTGGAATT GTGAGCGGAT

260 270 280 290 300

AACAATTTCA CACAGGAAAC AGGGGATCCG TCAACCTTTA GTTGGTTAAT

310 320 330 340 350

GTTACACCAA CAACGAAACC AACACGCCAG GCTTATTCCT GTGGAGTTAT

360 370 380 390 400

ATATGAGCGA TAAAATTATT CACCTGACTG ACGACAGTTT TGACACGGAT

410 420 430 440 450

GTACTCAAAG CGGACGGGGC GATCCTCGTC GATTTCTGGG CAGAGTGGTG

460 470 480 490 500

CGGTCCGTGC AAGATGATCG CCCCGATTCT GGATGAAATC GCTGACGAAT




EP 0 386 657 B1 



FIGURE 7 (continued)

510 520 530 540 550

ATCAGGGCAA ACTGACCGTT GCAAAACTGA ACATCGATCA AAACCCTGGT

560 570 580 590 600

ACTGCGCCGA AATATGGCAT CCGTGGTATC CCGACTCTGC TGCTGTTCAA

610 620 630 640 650

AAACGGTGAA GTGGCGGCAA CCAAAGTGGG TGCACTGTCT AAAGGTCAGT

660 670 680 690 700

TGAAAGAGTT CCTCGACGCT AACCTGGCGT AAGGGAATTT CATGTTCGGG

710 720 730 740 750

TGCCCCGTCG CTAAAAACTG GACGCCCGGC GTGAGTCATG CTAACTTAGT

760 770 780 790 800

GTTGACGGAT CCCCGGGGAT CCGTCAACCT TTAGTTGGTT AATGTTACAC

810 820 830 840 850

CAACAACGAA ACCAACACGC CAGGCTTATT CCTGTGGAGT TATATATGAG

860 870 880 890 900

CGATAAAATT ATTCACCTGA CTGACGACAG TTTTGACACG GATGTACTCA

910 920 930 940 950

AAGCGGACGG GGCGATCCTC GTCGATTTCT GGGCAGAGTG GTGCGGTCCG

960 970 980 990 1000

TGCAAGATGA TCGCCCCGAT TCTGGATGAA ATCGCTGACG AATATCAGGG

1010 1020 1030 1040 1050

CAAACTGACC GTTGCAAAAC TGAACATCGA TCAAAACCCT GGTACTGCGC

1060 1070 1080 1090 1100

CGAAATATGG CATCCGTGGT ATCCCGACTC TGCTGCTGTT CAAAAACGGT

1110 1120 1130 1140 1150

GAAGTGGCGG CAACCAAAGT GGGTGCACTG TCTAAAGGTC AGTTGAAAGA

1160 1170 1180 1190 1200

GTTCCTCGAC GCTAACCTGG CGTAAGGGAA TTTCATGTTC GGGTGCCCCG

1210 1220 1230 1240 1250

TCGCTAAAAA CTGGACGCCC GGCGTGAGTC ATGCTAACTT AGTGTTGACG

1260 1270 1280 1290 1300

GATCCCCCTG CCTCGCGCGT TTCGGTGATG ACGGTGAAAA CCTCTGACAC

1310 1320 1330 1340 1350

ATGCAGCTCC CGGAGACGGT CACAGCTTGT CTGTAAGCGG ATGCCGGGAG

1360 1370 1380 1390 1400

CAGACAAGCC CGTCAGGGCG CGTCAGCGGG TGTTGGCGGG TGTCGGGGCG

1410 1420 1430 1440 1450

CAGCCATGAC CCAGTCACGT AGCGATAGCG GAGTGTATAC TGGCTTAACT

1460 1470 1480 1490 1500

ATGCGGCATC AGAGCAGATT GTACTGAGAG TGCACCATAT GCGGTGTGAA

1510 1520 1530 1540 1550

ATACCGCACA GATGCGTAAG GAGAAAATAC CGCATCAGGC GCTCTTCCGC

1560 1570 1580 1590 1600

TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG

1610 1620 1630 1640 1650

TATCAGCTCA CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT

1660 1670 1680 1690 1700

AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG

1710 1720 1730 1740 1750

TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG

1760 1770 1780 1790 1800

AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA

1810 1820 1830 1840 1850

CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC
FIGURE 7 (continued)



1860 
TG77CCGACC 
1910 
GAAGCG7GGC 
1960 
TAGGTCGTTC 
2010 
CGACCGC7GC 
2060 
GACACGACTT 
2110 
GCGAGGTATG 
2160 
CGGCTACACT 
2210 
TTACCTTCGG 
2260 
GCTGGTAGCG 
2310 
AAAAGGATCT 
2360 
AGTGGAACGA 
2410 
AGGATCTTCA 
2460 
CTAAAGTATA 
2510 
GTGAGGCACC 
2560 
TGACTCCCCG 
2610 
CCCCAG7GCT 
2660 
7A7CAGCAA7 
2710 
GCAACTTTAT 
2760 
AG7AAG7AG7 
2810 
CAGGCATCGT 
■ 2860 
GG77CCCAAC 
2910 
AGCGG77AGC 
2960 
CAG7G77A7C 
3010 
A7GCCA7CCG 
3060 
A77C7GAGAA 
3110 
CACGGGA7AA 
3160 
GGAAAACG77 



1870 
C7GCCGC77A 
1920 
GCT77C7CAA 
1970 
GC7CCAAGC7 
2020 
GCC77A7CCG 
2070 
ATCGCCAC7G 
2120 
7AGGCGG7GC 
2170 
AGAAGGACAG 
2220 
AAAAAGAG77 
2270 
G7GG777777 
2320 
CAAGAAGA7C 
2370 
AAAC7CACG7 
2420 
CC7AGATCC7 
2470 
TATGAG7AAA 
2520 
TATC7CAGCG 
2570 
TCG7G7AGA7 
2620 
GCAA7GA7AC 
2670 
AAACCAGCCA 
2720 
CCGCC7CCAT 
2770 
TCGCCAG77A 
2820 
GG7G7CACGC 
2870 
GA7CAAGGCG 
2920 
7CC77CGG7C 
2970 
AC7CA7GG77 
3020 
7AAGA7GC7T 
3070 
7AG7G7A7GC 
3120 
TACCGCGCCA 
3170 

Ctmijo f~ r- *» <** <-* 



I860 
CCGGA7ACC7 
1930 
7GC7CACGC7 
1980 
GGGC7G7G7G 
2030 
G7AACTA7CG 
2080 
GCAGCAGCCA 
2130 
7ACAGAG77C 
2180 
7AT7TGG7AT 
2230 
GG7AGCTC7T 
2280 
7GT77GCAAG 
2330 
C7TTGATC7T 
2380 
7AAGGGA77T 
2430 
7TTAAAT7AA 
2480 
CTTGGTC7GA 
2S30 
ATCTGTCTAT 
2580 
AAC7ACGA7A 
2630 
CGCGAGACCC 
2680 
GCCGGAAGGG 
2730 
CCAGTCTATT 
2780 
ATAG7T7GCG 
2830 
7CG7CGT77G 
2880 
AG77ACATGA 
2930 
C7CCGA7CG7 
2980 
A7GGCAGCAC 
3030 
77C7G7GAC7 
3080 
GGCGACCGAG 
3130 
CA7AGCAGAA 
3180 
AAAAC7C7CA 



1890 
G7CCGCC777 
1940 
G7AGG7A7C7 
1990 
CACGAACCCC 
2040 
TCTTGAG7CC 
2090 
C7GG7AACAG 
2140 
77GAAG7GG7 
2190 
CTGCGCTC7G 
2240 
GATCCGGCAA 
2290 
CAGCAGA77A 
2340 
77C7ACGGGG 
2390 
7GGTCATGAG 
2440 
AAATGAAG77 
2490 
CAG77ACCAA 
2540 
TTCG7TCA7C 
. 2590 
CGGGAGGGC7 
2640 
ACGC7CACCG 
2690 
CCGAGCGCAG 
2740 
AATTGTTGCC 
27 90 
CAACG77G77 
2840 
G7A7GGC77C 
2890 
7CCCCCA7G7 
2940 
7G7CAGAAG7 
2990 
7GCA7AA77C 
3040 
GGTGAGTACT 
3090 
7TGC7C77GC 
3140 
C77TAAAAG7 
3190 
AGGA7C77AC 



1900 
C7CCC77CGG 
1950 
CAG77CGG7G 
2000 
CCG77CAGCC 
2050 
AACCCGG7AA 
2100 
GA77AGCAGA 
2150 
GGCC7AAC7A 
2200 
C7GAAGCCAG 
2250 
ACAAACCACC 
2300 
CGCGCAGAAA 
2350 
7C7GACGC7C 
2400 
A77A7CAAAA 
2450 
77AAATCAA7 
2500 
TGC77AA7CA 
2550 
CATAG7TGCC 
2600 
7ACCA7CTGG 
2650 
GC7CCAGA77 
.2700 
AAG7GGTCC7 
2750 
GGGAAGC7AG 
2800 
GCCA77GC7G 
2850 
A77CAGCTCC 
2900 
7G7GCAAAAA 
2950 
AAG77GGCCG 
3000 
7C77AC7G7C 
3050 
■CAACCAAG7C 
3100 
CCGGCG7CAA 
3150 
GC7CA7CA7T 
3200 
FIGURE 7 (continued)

3210 3220 3230 3240 3250 

ATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT TCAGCATCTT 

3260 3270 3280 3290 3300 

TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC

3310 3320 3330 3340 3350 

GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT 

3360 3370 3380 3390 3400 

CCTTTTTCAA TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG 

3410 3420 3430 3440 3450 

GATACATATT TGAATGTATT TAGAAAAATA AACAAATAGG GGTTCCGCGC 

3460 3470 3480 3490 3500 

ACATTTCCCC GAAAAGTGCC ACCTGACGTC TAAGAAACCA TTATTATCAT 

3510 3520 3530 3540 3550 

GACATTAACC TATAAAAATA GGCGTATCAC GAGGCCCTTT CGTCTTCAAG



AA 
FIGURE 8

10 20 30 40 50

GTTGACACAT ATGAGTCTTG TGATGTACTG GCTGATTTCT ACGACCAGTT

60 70 80 90 100

CGCTGACCAG TTGCACGAGT CTCAATTGGA CAAAATGCCA GCACTTCCGG

110 120 130 140 150

CTAAAGGTAA CTTGAACCTC CGTGACATCT TAGAGTCGGA CTTCGCGTTC

160 170 180 190 200

GCGTAACGCC AAATCAATAC GACTCACTAT AGAGGGACAA ACTCAAGGTC

210 220 230 240 250

ATTCGCAAGA GTGGCCTTTA TGATTGACCT TCTTCCGGTT AATACGACTC

260 270 280 290 300

ACTATAGGAG AACCTTAAGG TTTAACTTTA AGACCCTTAA GTGTTAATTA

310 320 330 340 350

GAGATTTAAA TTAAAGAATT ACTAAGAGAG GACTTTAAGT ATGCGTAACT

360 370 380 390 400

TCGAAAAGAT GACCAAACGT TCTAACCGTA ATGCTCGTGA CTTCGAGGCA

410 420 430 440 450

ACCAAAGGTC GCAAGTTGAA TAAGACTAAG CGTGACCGCT CTCACAAGCG

460 470 480 490 500

TAGCTGGGAG GGTCAGTAAG ATGGGACGTT TATATAGTGG TAATCTGGCA

510 520 530 540 550

CCGGATCCGG TATGAAGAGA TTGTTAAGTC ACGATAATCA ATAGGAGAAA

560 570 580 590 600

TCAATATGAT CGTTTCTGAC ATCGAAGCTA ACGCCCTCTT AGAGAGCGTC
FIGURE 8 (continued)



610 


620 


630 


640 


650 


ACTAAGTTCC 


AC7GCGGGG7 


7A7C7ACGAC 


7AC7CCACCG 


C7GAG7ACG7 


660 


670 


680 


690 


700 


AAGCTACCGT 


CCGAG7GAC7 


7CGG7GCG7A 


7C7GGATGCG 


C7GGAAGCCG 


710 


720 


730 


740 


750 


AGG77GCACG 


AGGCGG7C77 


A77G7G77CC 


ACAACGG7CA 


CAAG7A7GAC 


760 


770 


780 


790 


800 


GTTCCTGCAT 


7GACCAAAC7 


GGCAAAG77G 


CAA77GAACC 


GAGAG77CCA 


810 


820 


830 


840 


850 


CCTTCCTCGT 


GAGAAC7G7A 


77GACACCC7 


7G7G77G7CA 


CG77TGA77C 


860 


870 


880 


890 


900 


ATTCCAACCT 


CAAGGACACC 


GA7A7GGG7C 


77C7GCG77C 


CGGCAAG77G 


910 


920 


930 


940 


950 


CCCGGAAAAC 


GC777GGG7C 


7CACGCT77G 


GAGGCG7GGG 


G77A7CGC77 


960 


970 


980 


990 


1000 


AGGCGAGATG 


AAGGG7GAA7 


ACAAAGACGA 


C777AAGCG7 


A7GC7TGAAG 


1010 


1020 


1030 


1040 


1050 


AGCAGGG7GA 


AGAA7ACG77 


GACGGAA7GG 


AG7GG7GGAA 


C77CAACGAA 


1060 


1070 


1080 


1090 


1100 


GAGATGATGG 


AC7A7AACG7 


7CAGGACG77 


G7GG7AAC7A 


AAGCTC7CC7 


1110 


1120 


1130 


1140 


1150 


TGAGAAGCTA 


C7C7C7GACA 


AACA77AC77 


CCC7CC7GAG 


A77GAC777A 


1160 


1170 


1180 


1190 


1200 


CGGACGTAGG 


A7ACAC7ACG 


77C7GG7CAG 


AA7CCC77GA 


GGCCG77GAC 


1210 


1220 


1230 


1240 


1250 


ATTGAACATC 


G7GC7GCA7G 


GC7GC7CGC7 


AAACAAGAGC 


GCAACGGG77 


1260 


1270 


1280 


1290 


1300 


CCCGTTTGAC 


ACAAAAGCAA 


7CGAAGAG77 


G7ACG7AGAG 


77AGC7GC7C 


1310 


1320 


1330 


- 13*40 


1350 


GCCGCTCTGA 


G77GC7CCG7 


AAA77GACCG 


AAACG77CGG 


C7CG7GG7A7 


1360 


1370 


1380 


1390 


1400 


CAGCCTAAAG 


G7GGCAC7GA 


GA7G77C7GC 


CA7CCGCGAA 


CAGG7AAGCC 


1410 


1420 


1430 


1440 


. 1450 


ACTACCTAAA 


7ACCC7CGCA 


77AAGACACC 


7AAAG77GG7 


GG7A7C777A 


1460 


1470 


1480 


1490 


1500 


AGAAGCC7AA 


GAACAAGGCA 


CAGCGAGAAG 


GCCG7GAGCC 


77GCGAAC77 


1510 


1520 


1530 


1540 


1550 


GA7ACCCGCG 


AG7ACG77GC 


7GG7GC7CC7 


7ACACCCCAG 


77GAACA7G7 


1560 


1570 


1580 


1590 


1600 


TGTGTTTAAC 


CC77CG7C7C 


G7GACCACA7 


7CAGAAGAAA 


C7CCAAGAGG 


1610 


1620 


1630 


1640 


1650 


CTGGG i GsaGT 


CCCGACCAAG 


TACACCGaxA 


AGGv*iGwiCC 


7G7GG7GGAC 


1660 


1670 


1680 


1690 


1700 


GATGAGG7AC 


7CGAAGGAG7 


ACG7G7AGA7 


GACCC7GAGA 


AGCAAGCCGC 


1710 


1720 


1730 


1740 


1750 


7A7CGACC7C 


A77AAAGAG7 


AC77GA7GA7 


TCAGAAGC3A 


A7CGGACAG7 


1760 


1770 


1780 


1750 


1800 


C7GC7GAGGG 


AGACAAAGCA 


TGGC77CG77 


AiU* »SJU •OA 


Gvj/w Gva i. aav) . 


1810 


1820 


1830 


1840 


1850 


A77CA7GG77 


C7G77AACCC 


TAA7GGAGCA 


G77ACGGG7C 


G7GCGACCCA 


1860 


1870 


1880 


1890 


1900 


7GCG7TCCCA 


AACC77GCGC 


AAA77CCGGG 


7G7ACG77C7 


CC77A7GGAG 


1910 


1920 


1930 


1940 


1950 


AGCAG7G7CG 


CGC7GC7777 


GGCGC7GAGC 


ACCA777GGA 


7GGG A7AACT 
FIGURE 8 (continued)



1960 


1570 


1980 


1990 


20CC 


GG7AAGCCTT 


GGG77CAGGC 


7GGCA7CGAC 


GCA7CCGG7C 


77GAGC7ACG 


2010 


2020 


2030 


2040 


2050 


CTGCTTGGCT 


CAC77CA7G3 


C7CGC777GA 


7AACGGCGAG 


7ACGC7CACG 


2060 


2070 


2080 


2090 


2100 


AGA77C77AA 


CGGCGACATC 


CACAC7AAGA 


ACCAGATAGC 


7GC7GAAC7A 


2110 


2120 


2130 


2140 


2150 


CC7ACCCGAG 


A7AACGC7AA GACG77CA7C 


7A7GGG77CC 


7CTA7GG7GC 


2160 


2170 


2180 


2190 


2200 


7GG7GA7GAG 


AAGATTGGAC 


AGA77G77GG 


7GCTGGTAAA 


GAGCGCGG7A 


2210 


2220 


2230 


2240 


2250 


AGGAACTCAA 


GAAGAAA77C 


C77GAGAACA 


CCCCCGCGAT 


TGCAGCAC7C 


2260 


2270 


2280 


2290 


2300 


CGCGAGTCTA 


TCCAACAGAC 


AC77G7CGAG 


7CC7C7CAA7 


GGG7AGC7GG 


2310 


2320 


2330 


2340 


2350 


TGAGCAACAA 


G7CAAGTGGA AACGCCGC7G 


GAT7AAAGG7 


C7GGA7GG7C 


2360 


2370 


2380 


2390 


2400 


GTAAGGTACA 


CG77CG7AG7 


CC7CACGC7G 


CC77GAA7AC 


CC7AC7GCAA 


2410 


2420 


2430 


2440 


2450 


TCTGCTGGTG 


C7C7CA7C7S 


CAAAC7G7GG 


A77A7CAAGA 


CCGAAGAGA7 


2460 


2470 


2480 


2490 


2500 


GCTCGTAGAG 


AAAGGC77GA 


AGCA7GGC7G 


GGA7GGGGAC 


77TGCG7ACA 


: 2510 


2520 


2530 


2540 


2S50 


TGGCATGGGT 


ACA7GA7GAA 


A7CCAAG7AG 


GC7GCCG7AC 


CGAAGAGA77 


2560 


2570 


2580 


2590 


2600 


GCTCAGGTGG 


7CA77GAGAC 


CGCACAAGAA 


GCGA7GCGC7 


GGG7TGGAGA 


2610 


2620 


2630 


2640 


2650 


CCACTGGAAC 


77CCGG7G7C 


T7C7GGA7AC 


CGAAGG7AAG 


A7GGG7CC7A 


2660 


2670 


2680 


. 2690 


2700 


ATTGGGCGAT 


77GCCAC7GA 


7ACAGGAGGC 


7AC7CA7GAA 


CGAAAGACAC 


2710 


2720 


2730 


2740 


2750 


TTAACAGGTG 


C7GC77C7GA 


AA7GC7AG7A GCC7ACAAAT 


7TACCAAAGC 


2760 


2770 


2780 


2790 


..2800 


TGGGTACACT 


G7C7ATTACC 


C7A7GC7GAC 


TCAGAGTAAA 


GAGGAC77GG 


2810 


2820 


2830 


2840 


2850 


TTGTATGTAA 


GGA7GG7AAA 


777AG7AAGG 


77CAGG77AA 


AACAGCCACA 


2860 


2870 


' 2880 


2890 


2900 


ACGG77CAAA 


CCAACACAGG 


AGA7GCCAAG 


CAGG77AGGC 


7AGG7GGATG 


2910 


2920 


2930 


2940 


2950 


CGGTAGGTCC 


GAA7A7AAGG 


A7GGAGAC77 


7GACA77C77 


GCGG77G7GG 


2960 


2970 


2980 


2990 


3000 


TTGACGAAGA 


7G7GC77A77 


T7CACA7GGG 


ACGAAG7AAA 


AGG7AAGACA 


3010 


3020 


3030 


3040 


3050 


TCCATGTG7G 


7CGGCAAGAG 


AAACAAAGGC 


A7AAAAC7A7 


AGGAGAAAT7 


3060 


3070 


3080 






ATTATGGC7A 


7GACAAAGAA 


A777CCGGA7 


C 
FIGURE 9



10 


20 


30 


40 


50 


AA7GC7AC7A 


C7A77AG7AG 


AA77GA7GCC 


ACC7777CAG 


CTCGCGCCCC 


60 


70 


80 


90 


100 


AAA7GAAAAT 


A7AGC7AAAC 


AGG77ATTGA 


CCA777GCGA 


AA7G7A7C7A 


110 


120 


130 


140 


150 


A7GG7CAAAC 


7AAA7C7ACT 


CG77CGCAGA 


AT7GGGAA7C 


AACTG77ACA 


160 


170 


180 


190 


200 


TGGAATGAAA 


C77CCAGACA 


CCG7AC777A 


G77GCA7A7T 


7AAAACA7G7 


210 


220 


230 


240 


250 


TGAGCTACAG 


CACCAGATTC 


AGCAA77AAG 


C7C7AAGCCA 


7CCGCAAAAA 


260 


270 


280 


290 


300 


TGACCTCTTA 


7CAAAAGGAG 


CAA7TAAAGG 


TAC7C7C7AA 


7CC7GACC73 


310 


320 


330 


340 


350 


TTGGAGTTTG 


C77CCGG7CT 


GG77CGC7TT 


GAAGCTCGAA 


TTAAAACGCG 


360 


370 


380 


390 


4C0 


ATATTTGAAG 


7C777CGGGC 


T7CC7C77AA 


7C77777GA7 


GCAA7CCGC7 


410 


420 


430 


440 


450 


77GC77C7GA 


C7A7AA7AG7 


CAGGG7AAAG 


ACC7GA7777 


7GA777A7GG 


4 60 


470 


480 


4 90 


500 


TCATTC7CGT 


777C7GAAC7 


G777AAAGCA 


77TGAGGGGG 


A77CAA7GAA 


510 


520 


530 


540 


550 


TATTTATGAC 


GA77CCGCAG 


7A77GGACGC 


TA7CCAG7C7 


AAACA7777A 


560 


570 


580 


590 


600 


C7ATTACCCC 


C7C7GGCAAA 


AC77C7777G 


CAAAAGCC7C 


7CGC7A7777 


610 


620 


630 


640 


650 


GGTTTTTATC 


G7CG7C7GG7 


AAACGAGGGT 


7A7GA7AG7G 


77GC7C77AC 


660 


670 


680 


690 


700 


TATGCCTCGT 


AA77CC7777 


GGCG77A7G7 


A7C7GCA77A 


G77GAA7G75 


710 


720 


730 


740 


750 


GTATTCCTAA 


A7C7CAAC7G 


A7GAA7C777 


C7ACC7G7AA 


7AA7G77G77 


7 60 


770 


780 


790 


800 


CCGTTAGTTC 


G7777A77AA 


CG7AGAT777 


7C77CCCAAC 


G7CC7GAC73 


610 


620 


830 


840 


850 


G7ATAATGAG 


CCAG77C77A 


AAAT0GCA7A 


AGG7AAT7CA 


CAA7GA77AA 


' ' 860 


870 


880 


890 


900 
FIGURE 9 (continued)



AGTTwaAA 1 I 


AaaWwaiv. • w 


AAftrrrAAT - * 

ArtUWCwnAl m 


*** AC"^" AC C G 7 








§30 


940 


950 


CTCGaC>vw , w\s 


L>aaVj\>V* a *aa 


l WAW « W A/% I w 


AGCAGC777G 

AwWAWW * * a W 


77ACG77GA7 


q *o 




Q30 

7 W w 


990 


1000 


T 1 VjVjrtj x An X w 


AAl AX WWWW ii 


TCTTGTCAAG 

A w X iVJl WAAVI 


ATTAC7C77G 


A7GAAGG7CA 


1010 

X w X W 


1020 


1030 


lt)40 


1050 


\j^La«^c a a a 


WWW WW A WW A w 


TGTACACCG7 

X \J ± AW AW WW A 


7CA7C7G7CC 


7C777C AAAG 


1 OfiO 

1UOU 


1070 

Xw / w 


loao 

X w w w 


1090 


1100 


A 1 WO x w avj X X 


WWW 4 A WWW 4 * 


ATGATTGACC 

A A W A A A W AW W 


G7C7GCGCC7 


CG77CCGGC7 


1 1 in 

X X XV 


1120 

X X X W 


1130 


1140 


1150 




W AO W AWO X WW 


CGGATT^CGA 

WWW A A A *WWA 


CACAA777A7 


CAGGCGA7GA 


11 60 

X X ww 


1170 


1180 


H90 


1200 


X awaaa A w a w 


rGTTGTACTT 

Ww X A W X AW A A 


TG777CGCGC 

AW A A A WWWW>» 


7TGG7A7AAT 


CGC7GGGGG7 


1210 
X X X w 


1220 


1230 


1240 


1250 


TAAAGATGAG TGTTTTAGTG TATTCTTTCG CCTCTTTCGT TTTAGGTTSG
1260 1270 1280 1290 1300

[illegible] GTGGCATTAC GTATTTTACC CGTTTAATGC AAACTTCCTC
1310 1320 1330 1340 1350

ATGAAAAAGT CTTTAGTCCT CAAAGCCTCT GTAGCCGTTG CTACCCTCGT
1360 1370 1380 1390 1400

TCCGATGCTG TCTTTCGCTG CTGAGGGTGA CGATCCCGCA AAAGCGGCCT
1410 1420 1430 1440 1450

TTAACTCCCT GCAAGCCTCA GCGACCGAAT ATATCGGTTA TGCGTGGGCG
1460 1470 1480 1490 1500

ATGGTTGTTG TCATTGTCGG CGCAACTATC GGTATCAAGC TGTTTAAGAA
1510 1520 1530 1540 1550

ATTCACCTCG AAAGCAAGCT GATAAACCGA TACAATTAAA GGCTCCTTTT
1560 1570 1580 1590 1600

GGAGCCTTTT TTTTTGGAGA TTTTCAACGT GAAAAAATTA TTATTCGCAA
1610 1620 1630 1640 1650

TTGCTTTAGT TGTTCCTTTC TATTCTCACT CCGCTGAAAC TGTTGAAAGT
1660 1670 1680 1690 1700

TGTTTAGCAA AACCCCATAC AGAAAATTCA TTTACTAACG TCTGGAAAGA
1710 1720 1730 1740 1750

[illegible] TTAGATCGTT ACGCTAACTA TGAGGGTTGT CTGTGGAATG
1760 1770 1780 1790 1800

CTACAGGCGT TGTAGTTTGT ACTGGTGACG AAACTCAGTG TTACGGTACA
1810 1820 1830 1840 1850

TGGGTTCCTA TTGGGCTTGC TATCCCTGAA AATGAGGGTG GTGGCTCTGA
1860 1870 1880 1890 1900

GGGTGGCGGT TCTGAGGGTG GCGGTTCTGA GGGTGGCGGT ACTAAACCTC
1910 1920 1930 1940 1950

CTGAGTACGG TGATACACCT ATTCCGGGCT ATACTTATAT CAACCCTCTC
1960 1970 1980 1990 2000

GACGGCACTT ATCCGCCTGG TACTGAGCAA AACCCCGCTA ATCCTAATCC
2010 2020 2030 2040 2050

TTCTCTTGAG GAGTCTCAGC CTCTTAATAC TTTCATGTTT CAGAATAATA
2060 2070 2080 2090 2100

GGTTCCGAAA TAGGCAGGGG GCATTAACTG TTTATACGGG CACTGTTACT
2110 2120 2130 2140 2150

CAAGGCACTG ACCCCGTTAA AACTTATTAC CAGTACACTC CTGTATCATC
2160 2170 2180 2190 2200

AAAAGCCATG TATGACGCTT ACTGGAACGG TAAATTCAGA GACTGCGCTT
2210 2220 2230 2240 2250
FIGURE 9 (continued)



TCCATTCTGG CTTTAATGAA GATCCATTCG TTTGTGAATA TCAAGGCCAA
2260 2270 2280 2290 2300

TCGTCTGACC TGCCTCAACC TCCTGTCAAT GCTGGCGGCG GCTCTGGTGG
2310 2320 2330 2340 2350

TGGTTCTGGT <*GC<»GCTCTG AGGGTGGTGG CTCTGAGGGT GGCGGTTCTG
2360 2370 2380 2390 2400

AGGGTGGCGG CTCTGAGGGA GGCGGTTCCG GTGGTGGCTC TGGTTCCGGT
2410 2420 2430 2440 2450

GATTTTGATT ATGAAAAGAT GGCAAACGCT AATAAGGGGG CTATGACCGA
2460 2470 2480 2490 2500

AAATGCCGAT GAAAACGCGC TACAGTCTGA CGCTAAAGGC AAACTTGATT
2510 2520 2530 2540 2550

CTGTCGCTAC TGATTACGGT GCTGCTATCG ATGGTTTCAT TGGTGACGTT
2560 2570 2580 2590 2600

TCCGGCCTTG CTAATGGTAA TGGTGCTACT GGTGATTTTG CTGGCTCTAA
2610 2620 2630 2640 2650

TTCCCAAATG GCTCAAGTCG GTGACGGTGA TAATTCACCT TTAATGAATA
2660 2670 2680 2690 2700

ATTTCCGTCA ATATTTACCT TCCCTCCCTC AATCGGTTGA ATGTCGCCCT
2710 2720 2730 2740 2750

TTTGTCTTTA GCGCTGGTAA ACCATATGAA TTTTCTATTG ATTGTGACAA
2760 2770 2780 2790 2800

AATAAACTTA TTCCGTGGTG TCTTTGCGTT TCTTTTATAT GTTGCCACCT
2810 2820 2830 2840 2850

TTATGTATGT ATTTTCTACG TTTGCTAACA TACTGCGTAA TAAGGAGTCT
2860 2870 2880 2890 2900

TAATCATGCC AGTTCTTTTG GGTATTCCGT TATTATTGCG TTTCCTCGGT
2910 2920 2930 2940 2950

TTCCTTCTGG TAACTTTGTT CGGCTATCTG CTTACTTTTC TTAAAAAGGG
2960 2970 2980 2990 3000

CTTCGGTAAG ATAGCTATTG CTATTTCATT GTTTCTTGCT CTTATTATTG
3010 3020 3030 3040 3050

GGCTTAACTC AATTCTTGTG GGTTATCTCT CTGATATTAG CGCTCAATTA
3060 3070 3080 3090 3100

CCCTCTGACT TTGTTCAGGG TGTTCAGTTA ATTCTCCCGT CTAATGCGCT
3110 3120 3130 3140 3150

TCCCTGTTTT TATGTTATTC TCTCTGTAAA GGCTGCTATT TTCATTTTTG
3160 3170 3180 3190 3200

ACGTTAAACA AAAAATCGTT TCTTATTTGG ATTGGGATAA ATAATATGGC
3210 3220 3230 3240 3250

TGTTTATTTT GTAACTGGCA AATTAGGCTC TGGAAAGACG CTCGTTAGC3
3260 3270 3280 3290 3300

TTGGTAAGAT TCAGGATAAA ATTGTAGCTG GGTGCAAAAT AGCAACTAAT
3310 3320 3330 3340 3350

CTTGATTTAA GGCTTCAAAA CCTCCCGCAA GTCGGGAGGT TCGCTAAAAC
3360 3370 3380 3390 3400

GCCTCGCGTT CTTAGAATAC CGGATAAGCC TTCTATATCT GATTTGCTT3
3410 3420 3430 3440 3450

CTATTGGGCG [illegible] TCCTACGATG AAAATAAAAA CGGCTTGCTT
3460 3470 3480 3490 3500

GTTCTCGATG AGTGCGGTAC TTGGTTTAAT ACCCGTTCTT GGAATGATAA
3510 3520 3530 3540 3550

GGAAAGACAG CCGATTATTG ATTGGTTTCT ACATGCTCGT AAATTAGTAT
3560 3570 3580 3590 3600






EP 0 386 857 B1 



FIGURE 9 (continued)

GGGATATTAT TTTTCTTGTT CAGGACTTAT CTATTGTTGA TAAACAGGCG
3610 3620 3630 3640 3650

CGTTCTGCAT TAGCTGAACA TGTTGTTTAT TGTCGTCGTC TGGACAGAAT
3660 3670 3680 3690 3700

TACTTTACCT TTTGTCGGTA CTTTATATTC TCTTATTACT GGCTCGAAAA
3710 3720 3730 3740 3750

TGCCTCTGCC TAAATTACAT GTTGGCGTTG TTAAATATGG CGATTCTCAA
3760 3770 3780 3790 3800

TTAAGCCCTA CTGTTGAGCG TTGGCTTTAT ACTGGTAAGA ATTTGTATAA
3810 3820 3830 3840 3850

CGCATATGAT ACTAAACAGG CTTTTTCTAG TAATTATGAT TCCGGTGTTT
3860 3870 3880 3890 3900

ATTCTTATTT AACGCCTTAT TTATCACACG GTCGGTATTT CAAACCAT^A
3910 3920 3930 3940 3950

AATTTAGGTC AGAAGATGAA ATTAACTAAA ATATATTTGA AAAAGTTTTC
3960 3970 3980 3990 4000

TCGCGTTCTT TGTCTTGCGA TTGGATTTGC ATCAGCATTT ACATATAGTT
4010 4020 4030 4040 4050

ATATAACCCA ACCTAAGCCG GAGGTTAAAA AGGTAGTCTC TCAGACCTAT
4060 4070 4080 4090 4100

GATTTTGATA AATTCACTAT TGACTCTTCT CAGCGTCTTA ATCTAAGCTA
4110 4120 4130 4140 4150

TCGCTATGTT TTCAAGGATT CTAAGGGAAA ATTAATTAAT AGCGACGATT
4160 4170 4180 4190 4200

TACAGAAGCA AGGTTATTCA CTCACATATA TTGATTTATG TACTGTTTCC
4210 4220 4230 4240 4250

ATTAAAAAAG GTAATTCAAA TGAAATTGTT AAATGTAATT AATTTTGTTT
4260 4270 4280 4290 4300

TCTTGATGTT TGTTTCATCA TCTTCTTTTG CTCAGGTAA? TGAAATGAAT
4310 4320 4330 4340 4350

AATTCGCCTC TGCGCGATTT TGTAACTTGG TATTCAAAGC AATCAGGCGA
4360 4370 4380 4390 4400

ATCCGTTATT GTTTCTCCCG ATGTAAAAGG TACTGTTACT GTATATTCAT
4410 4420 4430 4440 4450

CTGACGTTAA ACTTGAAAAT CTACGCAATT TCTTTATTTC TGTTTTACGT
4460 4470 4480 4490 4500

GCTAATAATT TTGATATGGT TGGTTCAATT CCTTCCATAA TTCAGAAGTA
4510 4520 4530 4540 4550

TAATCCAAAC AATCAGGTAT ATATTGATGA ATTGCCATCA TCTGATAATC
4560 4570 4580 4590 4600

AGGAATATGA TGATAATTCC GCTCCTTCTG GTGGTTTCTT TGTTCCGCAA
4610 4620 4630 4640 4650

AATGATAATG TTACTCAAAC TTTTAAAATT AATAACGTTC GGGCAAAGGA
4660 4670 4680 4690 4700

TTTAATACGA GTTGTCGAAT TGTTTGTAAA GTCTAATACT TCTAAATCCT
4710 4720 4730 4740 4750

CAAATGTATT ATCTATTGAC GGCTCTAATC TATTAGTTGT TAGTGCACC~
4760 4770 4780 4790 4800

AAAGATATTT TAGATAACCT TCCTCAATTC CTTTCTACTG TTGATTTGC '
4810 4820 4830 4840 4850

AACTGACCAG ATATTGATTG AGGGTTTGAT ATTTGAGGTT CAGCAAGGTG
4860 4870 4880 4890 4900

ATGCTTTAGA TTTTTCATTT GCTGCTGGCT CTCAGCGTGG CACTGTTGCA
4910 4920 4930 4940 4950
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FIGURE 9 (continued) 



GGCGGTGTTA ATACTGACC3 CCTCACCTCT GTTTTATCTT [illegible]
4960 4970 4980 4990 5000

TTCGTTCGGT ATTTTTAATG GCGATGTTTT AGGGCTATCA GTTCGCGCAT
5010 5020 5030 5040 5050

TAAAGACTAA TAGCCATTCA AAAATATTGT CTGTGCCACG TATTCTTACG
5060 5070 5080 5090 5100

CTTTCAGGTC AGAAGGGTTC TATCTCTGTT GGCCAGAATG TCCCTTTTAT
5110 5120 5130 5140 5150

TACTGGTCGT GTGACTGGTG AATCTGCCAA TGTAAATAAT CCATTTCAGA
5160 5170 5180 5190 5200

CGATTGAGCG TCAAAATGTA GGTATTTCCA TGAGCGTTTT TCCTGTTGCA
5210 5220 5230 5240 5250

ATGGCTGGCG GTAAIATTGT TCTGGATATT ACCAGCAAGG CCGATAGTTT
5260 5270 5280 5290 5300

GAGTTCTTCT ACTCAGGCAA GTGATGTTAT TACTAATCAA AGAAGTATTG
5310 5320 5330 5340 5350

CTACAACGGT TAATTTGCGT GATGGACAGA CTCTTTTACT CGGTGGCCTC
5360 5370 5380 5390 5400

ACTGATTATA AAAACACTTC TCAAGATTCT GGCGTACCGT TCCTGTCTAA
5410 5420 5430 5440 5450

AATCCCTTTA ATCGGCCTCC TGTTTAGCTC CCGCTCTGAT TCCAACGAGG
5460 5470 5480 5490 5500

AAAGCACGTT ATACGTGCTC GTCAAAGCAA CCATAGTACG CGCCCTGTAG
5510 5520 5530 5540 5550

CGGCGCATTA AGCGCGGCGG GTGTGGTGGT TACGCGCAGC GTGACCGCTA
5560 5570 5580 5590 5600

CACTTGCCAG CGCCCTAGCG CCCGCTCCTT TCGCTTTCTT CCCTTCCTTT
5610 5620 5630 5640 5650

CTCGCCACGT TCGCCGGCTT TCCCCGTCAA GCTCTAAATC GGGGGCTCCC
5660 5670 5680 5690 5700

TTTAGGGTTC CGATTTAGTG CTTTACGGCA CCTCGACCCC AAAAAACTTG
5710 5720 5730 5740 5750

ATTTGGGTGA TGGTTCACGT AGTGGGCCAT CGCCCTGATA GACGGTTTTT
5760 5770 5780 5790 5800

CGCCCTTTGA CGTTGGAGTC CACGTTCTTT AATAGTGGAC TCTTGTTCCA
5810 5820 5830 5840 5850

AACTGGAACA ACACTCAACC CTATCTCGGG CTATTCTTTT GATTTATAAG
5860 5870 5880 5890 5900

GGATTTTGCC GATTTCGGAA CCACCATCAA ACAGGATTTT CGCCTGCTGG
5910 5920 5930 5940 5950

GGCAAACCAG CGTGGACCGC TTGCTGCAAC TCTCTCAGGG CCAGGCGGTG
5960 5970 5980 5990 6000

AAGGGCAATC AGCTGTTGCC CGTCTCGCTG GTGAAAAGAA AAACCACCCT
6010 6020 6030 6040 6050

GGCGCCCAAT ACGCAAACC3 CCTCTCCCCG CGCGTTGGCC GATTCATTAA
6060 6070 6080 6090 6100

TCCAGCTGGC ACGACAGGTT TCCCGACTGG AAAGCGGGCA GTGAGCGCAA
6110 6120 6130 6140 6150

CGCAATTAAT GTGAGTTACC [illegible] GGCACCCCAG GCTTTACACT
6160 6170 6180 6190 6200

TTATGCTTCC [illegible] TTGTGTGGAA TTGTGAGC3G [illegible]
6210 6220 6230 6240 6250

CACACAGGAA ACAGCTATGA CCATGATTAC GAATTCGAGC TCGCCCGGGG
6260 6270 6280 6290 6300
FIGURE 9 (continued)

ATCTGCCTGA ATAGGTACGA TTTACTAACT GGAAGAGGCA CTAAATGAAC
6310 6320 6330 6340 6350

ACGATTAACA TCGCTAAGAA CGACTTCTCT GACATCGAAC TGGCTGCTAT
6360 6370 6380 6390 6400

CCCGTTCAAC ACTCTGGCTG ACCATTACGG TGAGCGTTTA GCTCGCGAAC
6410 6420 6430 6440 6450

AGTTGGCCCT TGAGCATGAG TCTTACGAGA TGGGTGAAGC ACGC A *CCGC
6460 6470 6480 6490 6500

AAGATGTTTG AGCGTCAACT TAAAGCTGGT GAGGTTGCGG ATAACGCTGC
6510 6520 6530 6540 6550

CGCCAAGCCT CTCATCACTA CCCTACTCCC TAAGATGATT GCACGCATCA
6560 6570 6580 6590 6600

ACGACTGGTT TGAGGAAGTG AAAGCTAAGC GCGGCAAGCG CCCGACAGCC
6610 6620 6630 6640 6650

TTCCAGTTCC TGCAAGAAAT CAAGCCGGAA GCCGTAGCGT ACATCACCAT
6660 6670 6680 6690 6700

TAAGACCACT CTGGCTTGCC TAACCAGTGC TGACAATACA ACCGTTCAGG
6710 6720 6730 6740 6750

CTGTAGCAAG CGCAATCGGT CGGGCCATTG AGGACGAGGC TCGCTTCGGT
6760 6770 6780 6790 6800

CGTATCCGTG ACCTTGAAGC TAAGCACTTC AAGAAAAACG TTGAGGAACA
6810 6820 6830 6840 6850

ACTCAACAAG CGCGTAGGGC ACGTCTACAA GAAAGCATTT ATGCAAGTTG
6860 6870 6880 6890 6900

TCGAGGCTGA CATGCTCTCT AAGGGTCTAC TCGGTGGCGA GGCGTGGTCT
6910 6920 6930 6940 6950

TCGTGGCATA AGGAAGACTC TATTCATGTA GGAGTACGCT GCATCGAGAT
6960 6970 6980 6990 7000

GCTCATTGAG TCAACCGGAA TGGTTAGCTT ACACCGCCAA AATGCTGGCG
7010 7020 7030 7040 7050

TAGTAGGTCA AGACTCTGAG ACTATCGAAC TCGCACCTGA ATACGCTGAG
7060 7070 7080 7090 7100

GCTATCGCAA CCCGTGCAGG TGCGCTGGCT GGCATCTCTC CGATGTTCCA
7110 7120 7130 7140 7150

ACCTTGCGTA GTTCCTCCTA AGCCGTGGAC TGGCATTACT GGTGGTGGCT
7160 7170 7180 7190 7200

ATTGGGCTAA CGGTCGTCGT CCTCTGGCGC TGGTGCGTAC TCACAGTAAG
7210 7220 7230 7240 7250

AAAGCACTGA TGCGCTACGA AGACGTTTAC ATGCCTGAGG TGTACAAAGC
7260 7270 7280 7290 7300

GATTAACATT GCGCAAAACA CCGCATGGAA AATCAACAAG AAAGTCCTAG
7310 7320 7330 7340 7350

CGGTCGCCAA CGTAATCACC AAGTGGAAGC ATTGTCCGGT CGAGGACATC
7360 7370 7380 7390 7400

CCGCGATTG AGCGTGAAGA ACTCCCGATG AAACCGGAAG ACATCGACAT
7410 7420 7430 7440 7450

GAATCCTGAG GCTCTCACCG CGTGGAAACG TGCTGCCGCT GCTGTGTACC
7460 7470 7480 7490 7500

GCAAGGACAA GGCTCGCAAG TCTCGCCGTA TCAGCCTTGA GTTCATGCTT
7510 7520 7530 7540 7550

GAGCAAGCCA ATAAGTTTGC TAACCATAAG GCCATCTGGT TCCCTTACAA
7560 7570 7580 7590 7600

CATGGACTGG CGCGGTCGTG TTTACGCTGT GTCAATGTTC AACCCSCAAG
7610 7620 7630 7640 7650
FIGURE 9 (continued)



GTAACGATAT GACCAAAGGA CTGCTTACGC TGGCGAAAGG TAAACCAATC
7660 7670 7680 7690 7700

GGTAAGGAAG GTTACTACTG GCTGAAAATC CACGGTGCAA ACTGTGCGGG
7710 7720 7730 7740 7750

TGTCGATAAG GTTCCGTTCC CTGAGCGCAT CAAGTTCATT GAGGAAAACC
7760 7770 7780 7790 7800

ACSAGAACAT CATGGCTTGC GCTAAGTCTC CACTGGAGAA CACTTCGTGG
7810 7820 7830 7840 7850

GCTGAGCAAG ATTCTCCGTT CTGCTTCCTT GCGTTCTGCT TTGAGTACGC
7860 7870 7880 7890 7900

TGGGGTACAG CACCACGGCC TCAGCTATAA CTGCTCCCTT CCGCTGGCGT
7910 7920 7930 7940 7950

TTGACGGGTC TTGCTCTGGC ATCCAGCACT TCTCCGCGAT GCTCCGAGAT
7960 7970 7980 7990 8000

GAGGTAGGTG GTCGCGCGGT TAACTTGCTT CCTAGTGAAA CCGTTCAGGA
8010 8020 8030 8040 8050

CATCTACGGG ATTGTTGCTA AGAAAGTCAA CGAGATTCTA CAAGCAGACG
8060 8070 8080 8090 8100

CAATCAATGG GACCGATAAC GAAGTAGTTA CCGTGACCGA TGAGAACACT
8110 8120 8130 8140 8150

GGTGAAATCT CTGAGAAAGT CAAGCTGGGC ACTAAGGCAC TGGCTGGTCA
8160 8170 8180 8190 8200

ATGGCTGGCT TACGGTGTTA CTCGCAGTGT GACTAAGCGT TCAGTCATGA
8210 8220 8230 8240 8250

CGCTGGCTTA CGGGTCCAAA GAGTTCGGCT TCCGTCAACA AGTGCTGGAA
8260 8270 8280 8290 8300

GATACCATTC AGCCAGCTAT TGATaCCGGC AAGGGTCTGA TGTTCACTCA
8310 8320 8330 8340 8350

GCCGAATCAG GCTGCTGGAT ACATGGCTaA GCTGATTTGG GAATCTGTGA
8360 8370 8380 8390 8400

GCGTGACGGT GGTAGCTGCG GTTGAAGCAA TGAACTGGCT TAAGTCTGCT
8410 8420 8430 8440 8450

GCTAAGCTGC TGGCTGCTGA GGTCAAAGAT AAGAAGACTG GAGAGATTCT
8460 8470 8480 8490 8500

TCGCAAGCGT TGCGCTGTGC ATTGGGTAAC TCCTGATGGT TTCCCTGTGT
8510 8520 8530 8540 8550

GGCAGGAATA CAAGAAGCCT ATTCAGACGC GCTTGAACCT GATGTTCCTC
8560 8570 8580 8590 8600

GGTCAGTTCC GCTTACAGCC TACCATTAAC ACCAACAAAG ATAGCGAGAT
8610 8620 8630 8640 8650

TGATGCACAC AAACAGGAGT CTGGTATCGC TCCTAACTTT GTACACAGCC
8660 8670 8680 8690 8700

AAGACGGTAG CCACCTTCGT AAGACTGTAG TGTGGGCACA CGAGAAGTAC
8710 8720 8730 8740 8750

GGAATCGAAT CTTTTGCACT GATTCACGAC TCCTTCGGTA CCATTCCGGC
8760 8770 8780 8790 8800

TGACGCTGCG AACCTGTTCA AAGCAGTGCG CGAAACTATG GTTGACACAT
8810 8820 8830 8840 8850


A7GAG7C77G 


TGA7G7AC7G 


GC7GA777C7 
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CGC7GACCAG 


8860 


8870 


8880


8890 


8900 


77GCACGAG7 


CTCAATTGGA


CAAAA7GCCA 


GCAC77CCGG 
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8910 
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8540 


8550 
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8960 


8970 


8960 


8990 


9000 
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FIGURE 9 (continued) 

AAATCAATAC GACCCGGATC GGTCGACCTG CAGCCCAAGC TTGGCACTGG

9010 9020 9030 9040 9050

CCGTCGTTTT ACAACGTCGT GACTGGGAAA ACCCTGGCGT TACCCAACTT

9060 9070 9080 9090 9100

AATCGCCTTG CAGCACATCC CCCCTTCGCC AGCTGGCGTA ATAGCGAAGA

9110 9120 9130 9140 9150 

GGGCCGCACC GATCGCCCTT CCCAACAGTT GCGTAGCCTG AATGGCGAAT 

9160 9170 9180 9190 9200 

GGCGCTTTGC CTGGTTTCCG GCACCAGAAG CGGTGCCGGA AAGCTGGCTG 

9210 9220 9230 9240 9250 

GAGTGCGATC TTCCTGAGGC CGAQACNGTC GTCGTCCCCT CAAACTGGCA 

9260 9270 9280 9290 9300 

GATGCACGGT TACGATGCGC CCATCTACAC CAACGTAACC TATCCCATTA 

9310 9320 9330 9340 9350 

CGGTCAATCC GCCGTTTGTT CCCACGGAGA ATCCGACGGG TTGTTACTCG 

9360 9370 9380 9390 9400 

CTCACATTTA ATGTTGATGA AAGCTGGCTA CAGGAAGGCC AGACGCGAAT 

9410 9420 9430 9440 9450 

TATTTTTGAT GGCGTTCCTA TTGGTTAAAA AATGAGCTGA TTTAACAAAA 

9460 9470 9480 9490 9500 

ATTTAACGCG AATTTTAACA AAATATTAAC GTTTACAATT TAAATATTTG 

9510 9520 9530 9540 9550 

CTTATACAAT CTTCCTGTTT TTGGGGCTTT TCTGATTATC AACCGGGGTA 

9560 9570 9580 9590 9600 

CATATGATTG ACATGCTAGT TTTACGATTA CCGTTCATCG ATTCTCTTGT

9610 9620 9630 9640 9650

TTGCTCCAGA CTCTCAGGCA ATGACCTGAT AGCCTTTGTA GATCTCTCAA

9660 9670 9680 9690 9700

AAATAGCTAC CCTCTCCGGC ATGAATTTAT CAGCTAGAAC GGTTGAATAT

9710 9720 9730 9740 9750

CATATTGATG GTGATTTGAC TGTCTCCGGC CTTTCTCACC CTTTTGAATC

9760 9770 9780 9790 9800

TTTACCTACA CATTACTCAG GCATTGCATT TAAAATATAT GAGGGXTCTA

9810 9820 9830 9840 9850

AAAATTTTTA TCCTTGCGTT GAAATAAAGG CTTCTCCCGC AAAAGTATTA

9860 9870 9880 9890 9900

CAGGGTCATA ATGTXTTTGG TACAACCGAT TTAGCTTTAT GCTCTGAGGC

9910 9920 9930 9940 9950

TTTATTGCTT AATTTTGCTA ATTCTTTGCC TTGCCTGTAT GATTTATTGG

ATGTT
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